Thursday, May 27, 2004

CIF Updates version 4

  Back in mid-March I was working on cleaning up our customer information file. I wrote a couple of posts describing my attempts to execute the updates on the AS/400.
  I ran into a wall where the sort of update that worked so elegantly on SQL Server would not run on DB2/400. I ran into another wall after I wrote a procedure using a cursor on the 400 only to find that the "CALL" verb was not allowed to me.
  The project lay dormant until I got a call a few days ago from Jerry in marketing who "thought this update would happen at the time of the May pre-merger maintenance."

  I asked Bob to show me the scripting software. All this does really is push keystrokes to the emulator. We spent a couple hours setting up a script to enter the branch and officer updates into the appropriate screens. When we were done, we found that it ran at the blazing speed of about forty records per minute. For one bank alone I have over 106,000 records to update.
  Bob suggested that I look back into some SQL method of doing the updates.
  I wrote a little bit of code in MS Access to write an Update statement for each record that I wanted to update. It looked kind of like this:

Sub cleanup()
Set db = CurrentDb()
Set qdef = db.QueryDefs("CleanupCIF")
Set rs = db.OpenRecordset("cifclean")

Do While Not rs.EOF
qdef.SQL = "UPDATE CORE.CIFFILE SET CUPOFF='" & rs!Officer _
& "', CUBRCH=" & rs!Branch & " WHERE CUNBR='" & rs!Number & "'"
db.Execute "CleanupCIF"
rs.MoveNext
Loop
End Sub


  The "CIFCleanup" query had the ODBC connection string and some other settings saved with it that saved me some code. I ran this code for the same 200 records which we tested on the scripting system and it took about five and a half minutes. It was slower than the scripting stuff.
  I was disappointed, but not too surprised. I know that even though these are passthrough queries Access and ODBC are wrapping their layers around the queries and packaging them up to be translated a few times before actually being executed on the AS/400 and then returning completion codes.
  I theorized that I could speed up the process by nailing up the connection. One less thing for it to have to accomplish each time should speed it up. My code mixed DAO and ADO like this:
Sub cleanup()
Dim conn As New ADODB.Connection
Set db = CurrentDb()
Set rs = db.OpenRecordset("cifclean")
conn.Open "DSN=CORESYS"
Do While Not rs.EOF
strSQL = "UPDATE CORE.CIFFILE SET CUPOFF='" & rs!Officer _
& "', CUBRCH=" & rs!Branch & " WHERE CUNBR='" & rs!Number & "'"
conn.Execute strSQL
rs.MoveNext
Loop
conn.Close
End Sub

  This ran even slower than the previous attempt, coming in at a little over six minutes. Apparently Access was better at handling the connection stuff than I was.
  I went home for the day and let the problem percolate.
  This morning I decided to leverage SQL to my advantage. I couldn't make it work with one simple statement, but I didn't have to settle for 106,000 statements either. There could be a compromise.
  This would require a little more code, but nothing horrible:
Sub clean()
Set db = CurrentDb()
Set qdef = db.QueryDefs("CleanupCIF")

'Update groups of officers at a maximum of 100 at a time.
Set rsGrp = db.OpenRecordset("SELECT Officer FROM cifclean GROUP BY Officer")
Do While Not rsGrp.EOF
strSQL = "SELECT Number FROM cifclean WHERE Officer='" & rsGrp!Officer & "'"
Set rs = db.OpenRecordset(strSQL)
strSQL = ""
intX = 0
Do While Not rs.EOF
intX = intX + 1
strSQL = strSQL & "'" & rs!Number & "',"
If intX = 100 Then
strSQL = "UPDATE CORE.CIFFILE SET CUPOFF='" & rsGrp!Officer _
& "' WHERE CUNBR IN (" & Left(strSQL, Len(strSQL) - 1) & ")"
qdef.SQL = strSQL
db.Execute "CleanupCIF"
strSQL = ""
intX = 0
End If
rs.MoveNext
Loop
If intX > 0 Then
strSQL = "UPDATE CORE.CIFFILE SET CUPOFF='" & rsGrp!Officer _
& "' WHERE CUNBR IN (" & Left(strSQL, Len(strSQL) - 1) & ")"
qdef.SQL = strSQL
db.Execute "CleanupCIF"
End If
rsGrp.MoveNext
Loop

'Repeat for branch

...

End Sub


  Knowing that the lists of unique officers and branches were considerably shorter than the distinct customer ID's, I grouped them together. This code will will create one query for each unique officer ID, unless there are more than 100 customers to be updated to that officer in which case it will pakage the updates in groups of 100.
  This process was much quicker. In fact, I was able to run not 200, but 2,000 in under five minutes. I ran 10,000 updates in about sixteen and a half minutes. I should be able to accomplish the full set of 106,000 updates in about three hours.
  This beats the heck out of the forty plus hours that the scripting would have required. That couldn't have been a concurrent forty-some hours either since it required an interactive session. Each night during nightly processing those sessions become unavailable.
  Today, I just ran against the test bank. If everyone is satisfied with the results tomorrow, I'll work on the getting proper permissions to update the production database.

Posted to Work Life at May 27, 2004 2:22 PM