Hi all,
Short intro on what I'm trying to do and how I've built it: I need to read a SharePoint list with 6000+ list items and get the version history of each item, which actually means requesting versioning data for each field of each list item. Long story short: this results in about 620,000 requests that have to be made to the SharePoint web services. Doing this in a single-threaded setup would take forever, so multi-threading is the way to go. Also, this is still on SQL Server 2008 R2, which means .NET 2.0, which means that I can't use the parallel processing features added in .NET 4.0, so I figured the ThreadPool was my easiest way forward.
So, in my main method, I enumerate all the fields of the list items, and I wrote a method (which I queue as work items on the ThreadPool) that takes the list ID, list item ID and field name as parameters and gets the version history of that field. After it has read the version history, it adds this data to the OutputBuffer of the ScriptComponent.
I also created a TaskInfo class, which holds the data that I pass to my ProcessListItemField method when I queue it on the ThreadPool.
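For reference, the TaskInfo class is roughly the following. This is a sketch reconstructed from how the class is used in the snippets in this post (member names and constructor argument order are taken from those calls); whether the members are fields or properties, and whether they are read-only, is an assumption.

```csharp
using System;

// Sketch of the TaskInfo holder, reconstructed from its usage in this post.
// Read-only fields are an assumption; the real class may use properties.
public class TaskInfo
{
    public readonly int ListItemID;
    public readonly string ListItemField;
    public readonly string ListUID;
    public readonly DateTime ListItemCreatedDateTime;

    // Argument order matches the constructor call in ProcessListItem.
    public TaskInfo(int listItemID, string listItemField, string listUID, DateTime listItemCreatedDateTime)
    {
        ListItemID = listItemID;
        ListItemField = listItemField;
        ListUID = listUID;
        ListItemCreatedDateTime = listItemCreatedDateTime;
    }
}
```

Since each work item gets its own instance and nothing modifies it after construction, the object itself should be safe to hand to a background thread.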
So, first, in my main method:
    foreach (XmlNode ListItem in results.SelectNodes("descendant::z:row", NamespacesMgr))
    {
        ProcessListItem(ListItem, ListUID);
    }
Which executes this method:
    private void ProcessListItem(XmlNode listItem, String ListUID)
    {
        String ListItemID = listItem.Attributes.GetNamedItem("ows_ID").Value.ToString();
        String ListItemCreatedDateTime = listItem.Attributes.GetNamedItem("ows_Created").Value.ToString();
        foreach (XmlAttribute ListItemField in listItem.Attributes)
        {
            // This is where I add the task to get and process the list item's version history to the ThreadPool
            ThreadPool.QueueUserWorkItem(
                new WaitCallback(ProcessListItemField),
                new TaskInfo(Int32.Parse(ListItemID), ListItemField.Name, ListUID, DateTime.Parse(ListItemCreatedDateTime)));
            workitems = workitems + 1;
        }
    }
As you can see in the above snippet, I keep a count of the number of work items I queue in the integer 'workitems'. This integer is decremented by 1 at the end of each ProcessListItemField call that I added to the queue.
So, the QueueUserWorkItem call adds the following method as a task to the queue, to be executed by one of the available threads in the ThreadPool:
    public void ProcessListItemField(Object taskinfo)
    {
        IDTSComponentMetaData100 myMetadata = ComponentMetaData;
        bool fireagain = false;
        try
        {
            TaskInfo taskInfo = (TaskInfo)taskinfo;
            XmlNode ListItemFieldVersionCollection = ListsService.GetVersionCollection(taskInfo.ListUID, taskInfo.ListItemID.ToString(), taskInfo.ListItemField);
            foreach (XmlNode ListItemFieldVersion in ListItemFieldVersionCollection.ChildNodes)
            {
                Output0Buffer.AddRow();
                Output0Buffer.ListItemID = Int32.Parse(taskInfo.ListItemID.ToString());
                Output0Buffer.FieldName = taskInfo.ListItemField; //varchar(500)
                Output0Buffer.Modified = DateTime.Parse(ListItemFieldVersion.Attributes.GetNamedItem("Modified").Value.ToString()); //varchar(50)
                Output0Buffer.Editor = ListItemFieldVersion.Attributes.GetNamedItem("Editor").Value.ToString(); //varchar(1000)
                Output0Buffer.Value = (taskInfo.ListItemField == "ows_Created"
                    ? taskInfo.ListItemCreatedDateTime.ToString()
                    : ListItemFieldVersion.Attributes.GetNamedItem(taskInfo.ListItemField).Value.ToString()); //varchar(Max)
            }
            workitems = workitems - 1;
        }
        catch(Exception ex)
        {
            myMetadata.FireWarning(0, "ScriptComponent", "Error while processing ListItemID " + ((TaskInfo)taskinfo).ListItemID.ToString() + " - " + ex.Message, String.Empty, 0);
        }
    }
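One thing to note about the method above: Output0Buffer.AddRow() followed by the column assignments works on the buffer's current row, so if two threads interleave, one thread's assignments can end up in a row that another thread just added. A minimal sketch of serializing the writes is below. SerializedRowWriter and its List are hypothetical stand-ins for Output0Buffer (which only exists inside the SSIS-generated script), so treat this as an illustration of the locking pattern rather than drop-in code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the shared Output0Buffer: every write goes
// through one lock, so rows from different threads cannot interleave.
public class SerializedRowWriter
{
    private readonly object sync = new object();
    private readonly List<string[]> rows = new List<string[]>();

    // Writes all rows produced by one work item as a single unit.
    public void WriteRows(IList<string[]> batch)
    {
        lock (sync)
        {
            foreach (string[] row in batch)
            {
                // In the real script this would be AddRow() plus the
                // column assignments for that row.
                rows.Add(row);
            }
        }
    }

    public int Count
    {
        get { lock (sync) { return rows.Count; } }
    }
}
```

The idea would be to collect the versions returned by GetVersionCollection in a local list first (no lock needed, since the web service call is the slow part) and only take the lock for the short time it takes to copy them into the buffer.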
Finally, in my main method I do the following to make sure that the main method does not finish before all workitems have been processed by the background threads:
    while (workitems > 0)
    {
        int avWorkerThreads;
        int avCompletionPortThreads;
        ThreadPool.GetAvailableThreads(out avWorkerThreads, out avCompletionPortThreads);
        myMetadata.FireInformation(0, "ScriptComponent", " [" + DateTime.Now.ToString("G") + "] - There are currently " + avWorkerThreads.ToString() + " WorkerThreads available.", String.Empty, 0, ref fireagain);
        myMetadata.FireInformation(0, "ScriptComponent", " [" + DateTime.Now.ToString("G") + "] - There are currently " + avCompletionPortThreads.ToString() + " CompletionPortThreads available.", String.Empty, 0, ref fireagain);
        myMetadata.FireInformation(0, "ScriptComponent", " [" + DateTime.Now.ToString("G") + "] - There are currently " + workitems.ToString() + " workitems in the Threadpool.", String.Empty, 0, ref fireagain);
        Thread.Sleep(10000);
    }
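A caveat about this wait loop: 'workitems' is a plain int that is incremented on the main thread and decremented on many pool threads without any synchronization, so updates can get lost, and the decrement is skipped entirely when a work item throws (it sits inside the try block, not in a finally). Either could leave the counter at a wrong value. Below is a sketch of an alternative that only uses types available in .NET 2.0: an atomic counter plus a ManualResetEvent. PendingWork and PendingWorkDemo are hypothetical names, and the trivial work item stands in for the web service call.

```csharp
using System;
using System.Threading;

// Hypothetical helper: counts outstanding work items atomically and
// signals an event when the last one has finished.
public class PendingWork
{
    // Starts at 1: the queuing thread holds one slot until it calls Done()
    // itself, so the event cannot fire while items are still being queued.
    private int pending = 1;
    private readonly ManualResetEvent allDone = new ManualResetEvent(false);

    // Call before each ThreadPool.QueueUserWorkItem.
    public void Add()
    {
        Interlocked.Increment(ref pending);
    }

    // Call in the finally block of each work item, and once from the
    // queuing thread after the last item has been queued.
    public void Done()
    {
        if (Interlocked.Decrement(ref pending) == 0)
        {
            allDone.Set();
        }
    }

    // Returns true once all work is done, false if the timeout elapsed first.
    public bool Wait(int timeoutMilliseconds)
    {
        return allDone.WaitOne(timeoutMilliseconds, false);
    }
}

public static class PendingWorkDemo
{
    // Queues 'count' trivial work items and returns how many actually ran.
    public static int Run(int count)
    {
        PendingWork tracker = new PendingWork();
        int[] completed = new int[1];

        for (int i = 0; i < count; i++)
        {
            tracker.Add();
            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                try
                {
                    // Stand-in for the GetVersionCollection call.
                    Interlocked.Increment(ref completed[0]);
                }
                finally
                {
                    tracker.Done(); // runs even if the work throws
                }
            });
        }

        tracker.Done(); // release the queuing thread's own slot
        while (!tracker.Wait(10000))
        {
            // The periodic progress logging could go here.
        }
        return Interlocked.CompareExchange(ref completed[0], 0, 0);
    }
}
```

Calling PendingWorkDemo.Run(1000) blocks until all 1000 items have run. In the script, Add() would go right before QueueUserWorkItem in ProcessListItem and Done() in a finally block in ProcessListItemField.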
This results in a log that looks like this when I'm debugging:
[Screenshot: ScriptComponent log output while debugging]
This all works fine now, except for two issues I'm facing:
- As you can see, while I have more than enough work in the queue and still 1700+ threads available, performance is pretty bad: about 60-70 requests per 10 seconds. Is there any way to boost this?
- The script randomly exits: no error, no warning, no exception, it just exits and says the package is finished. It seems to entirely ignore the while loop that I inserted in the main method to wait for all the work items to finish. The screenshot above shows this: there is no error or finish message at the end of the ScriptComponent logging, while VS shows the message that the job is done. Any ideas on what might be the issue here?
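On the first issue, one thing that may be worth ruling out: the generated web service proxies make their calls through HttpWebRequest, and in a non-ASP.NET process the .NET Framework allows only 2 concurrent connections per host by default (ServicePointManager.DefaultConnectionLimit). That would cap the throughput no matter how many pool threads are free. A sketch of raising it is below. ConnectionSetup is a hypothetical helper name, the value 48 used afterwards is an arbitrary example, and whether the SharePoint server copes well with that many parallel requests is an open question.

```csharp
using System;
using System.Net;

// Hypothetical helper: raises the per-host limit on concurrent HTTP
// connections. It should run before the first web service call is made.
public static class ConnectionSetup
{
    public static void RaiseConnectionLimit(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
    }
}
```

For example, ConnectionSetup.RaiseConnectionLimit(48) could be called once at the start of the main method, before any work items are queued.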
Thanks!
C