Monday, January 31, 2011

More fun with Linq - DateTime Quarters

Here's a fun little query to get the start and end of each quarter in the current calendar year. Put this straight into LinqPad and check it out. Of course, LinqPad still requires end-of-line underscores for line continuations, hence the syntactic clutter.

Dim quarters = _
   From q In Enumerable.Range(1, 4).Select(Function(x) New DateTime(DateTime.Now.Year, ((x-1) * 3) + 1, 1)) _
   Select New With { _
      .QtrStart = q, _
      .QtrEnd = q.AddMonths(3).AddMilliseconds(-1) _
   }
quarters.Dump()
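
Outside of LinqPad, the same idea can be wrapped in a function that takes the year as a parameter. This is only a sketch: the QuarterDemo module, the GetQuarters name, and the use of Tuple for the result are my own choices for illustration, not anything the query above requires.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module QuarterDemo
    ' Hypothetical helper: start and end of each quarter for the given year.
    ' Item1 is the quarter start, Item2 is the last millisecond of the quarter.
    Function GetQuarters(ByVal calendarYear As Integer) As List(Of Tuple(Of DateTime, DateTime))
        Dim quarters =
            From n In Enumerable.Range(1, 4)
            Let qtrStart = New DateTime(calendarYear, ((n - 1) * 3) + 1, 1)
            Select Tuple.Create(qtrStart, qtrStart.AddMonths(3).AddMilliseconds(-1))
        Return quarters.ToList()
    End Function

    Sub Main()
        For Each q In GetQuarters(2011)
            Console.WriteLine("{0:d} - {1:d}", q.Item1, q.Item2)
        Next
    End Sub
End Module
```

Because the end of each quarter is computed from the start of the next one, month lengths and leap years take care of themselves.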

Thursday, January 27, 2011

Practical Linq

So my last post on Linq is a bit short on details.  Here's a problem I was recently trying to solve.  Our company's 4-digit postal code extension changed, and our corporate Outlook signature files all needed to be updated.  There doesn't appear to be any easy way to do this via any of the tools we have.  However, we are using roaming profiles in Windows, so all of our corporate profiles are in one common network location.  The trouble is, how do we query for all the signature files and how do we do it FAST?

There are 250,000+ files in 100,000+ folders in the roaming profiles area.  Imagine how long this would take with a simplistic System.IO.Directory.GetFiles() call that then dug through 250,000+ files trying to find what I wanted.  No way!

Linq to the rescue!

First off, Outlook signatures are located off the Application Data folder where there are a ton of other files.  On our network, all users' Windows application profile folders are here: "\\serverXYZ\RoamingData\Applications\".  From this root directory, you get into the users' directories and then each user has their own signature folder.  So, John Doe's signatures are here: "\\serverXYZ\RoamingData\Applications\John.Doe\Application Data\Microsoft\Signatures\" and Jane Doe's are here: "\\serverXYZ\RoamingData\Applications\Jane.Doe\Application Data\Microsoft\Signatures\".  I don't want to search every single directory for sigs since I should be able to get into only the directories I need.  So here's the Linq query I used to get there - by the way, this is all in VB.NET using Option Infer:

Dim baseDir = "\\serverXYZ\RoamingData\Applications"
Dim sigDirs =
    From a In Directory.GetDirectories(baseDir)
    Let sigPath = New DirectoryInfo(Path.Combine(a, "Application Data\Microsoft\Signatures"))
    Where sigPath.Exists
    Select sigPath

By habit and convention, I tend to name my selection variables in my Linq queries a, b, c, etc...  I used the magic "Let" keyword to assign a variable in the middle of my query.  This query is pretty simple - it starts off in the base directory I mentioned earlier to get the users' directories, and then it assembles their signature directory and returns an IEnumerable(Of System.IO.DirectoryInfo) of all the paths that exist.
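
If you want to try that query without a network share, here is a rough sketch that runs it against a throwaway folder tree under the temp directory. The SigDirDemo module, the FindSigDirs helper, and the two fake user folders are all made up for the demo; only the query itself comes from above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module SigDirDemo
    ' Hypothetical helper: the signature folders that exist under baseDir.
    Function FindSigDirs(ByVal baseDir As String) As List(Of DirectoryInfo)
        Dim sigSubPath = Path.Combine("Application Data", "Microsoft", "Signatures")
        Dim sigDirs =
            From a In Directory.GetDirectories(baseDir)
            Let sigPath = New DirectoryInfo(Path.Combine(a, sigSubPath))
            Where sigPath.Exists
            Select sigPath
        Return sigDirs.ToList()
    End Function

    Sub Main()
        ' Made-up stand-in for the roaming-profile share.
        Dim baseDir = Path.Combine(Path.GetTempPath(), "SigDirDemo_" & Guid.NewGuid().ToString("N"))
        Dim sigSubPath = Path.Combine("Application Data", "Microsoft", "Signatures")

        ' John has a signature folder, Jane does not.
        Directory.CreateDirectory(Path.Combine(baseDir, "John.Doe", sigSubPath))
        Directory.CreateDirectory(Path.Combine(baseDir, "Jane.Doe", "Desktop"))

        For Each d In FindSigDirs(baseDir)
            Console.WriteLine(d.FullName)   ' only John.Doe's folder is listed
        Next

        Directory.Delete(baseDir, True)
    End Sub
End Module
```

The Let clause is what keeps this tidy: the DirectoryInfo is built once per user folder and then reused by both the Where and the Select.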

Now, I need to get the signature files themselves.  Outlook makes 3 signature files: one in text format, one in rich text format, and one in HTML.  I decided to hinge everything off of finding the .txt file first.  I used the little-understood and slightly dreaded SelectMany() extension.  If you are unfamiliar, here's a better explanation than I could hope to give.  So, here is the query to get the sig files:

Dim sigFiles =
    From a In sigDirs.SelectMany(Function(d) Directory.GetFiles(d.FullName, "*.txt"))
    Let txt = New FileInfo(a)
    Let rtf = New FileInfo(Path.ChangeExtension(a, "rtf"))
    Let htm = New FileInfo(Path.ChangeExtension(a, "htm"))
    Where rtf.Exists _
    AndAlso htm.Exists
    Order By txt.FullName
    Select New With {.Txt = txt, .Rtf = rtf, .Htm = htm}

This query looks in all the users' signature directories and gets all the .txt files from each directory where there's also an RTF and an HTM version of the same file.  It returns an IEnumerable of an anonymous type with .Txt, .Rtf, and .Htm System.IO.FileInfo properties.
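
To see what SelectMany() is doing in that query, here is a tiny in-memory sketch. The file names are invented and the Flatten helper is hypothetical; each inner array stands in for what Directory.GetFiles() would return for one user's signature folder.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SelectManyDemo
    ' Hypothetical helper: flattens per-folder file lists into one sequence.
    Function Flatten(ByVal folders As IEnumerable(Of String())) As IEnumerable(Of String)
        Return folders.SelectMany(Function(files) files)
    End Function

    Sub Main()
        ' Made-up data: one array of .txt file names per signature folder.
        Dim folders As String()() = {
            New String() {"john-work.txt", "john-home.txt"},
            New String() {},
            New String() {"jane-work.txt", "jane-home.txt"}
        }

        Console.WriteLine(folders.Length)                         ' 3 folders
        Console.WriteLine(Flatten(folders).Count())               ' 4 files
        Console.WriteLine(String.Join(", ", Flatten(folders)))    ' all four names in one list
    End Sub
End Module
```

A plain Select() would have handed back three arrays, one per folder. SelectMany() hands back one flat sequence of file names, and the empty folder simply contributes nothing.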

Now, I have all the signature files.  And I know that they are stored in the Windows-1252 encoding (code page 1252, often loosely called Latin-1), so I'm going to iterate through them and do the replacement.  I'm making some assumptions here, like that the zip codes in the RTF and HTM files aren't bisected by style elements, but based on analysis of the files, I know that to be a reasonable assumption.
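
Here is a quick sketch of why the "not bisected by style elements" assumption matters. The address, the old and new extension values, and the module name are all made up for the example.

```vbnet
Imports System

Module ReplaceAssumptionDemo
    Sub Main()
        ' Made-up old and new 4-digit extensions.
        Dim oldZip = "1234"
        Dim newZip = "5678"

        ' The extension sits in one run of text, so Replace() finds it.
        Dim intact = "<p>Springfield, XX 00000-1234</p>"

        ' The extension is split by a style element, so Replace() finds nothing.
        Dim bisected = "<p>Springfield, XX 00000-12<b>34</b></p>"

        Console.WriteLine(intact.Replace(oldZip, newZip))
        Console.WriteLine(bisected.Replace(oldZip, newZip) = bisected)   ' True: nothing changed
    End Sub
End Module
```

A plain string Replace() only sees the raw characters in the file, so it is worth spot-checking a few RTF and HTM signatures before trusting it.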

So here's the final VB.NET method in its 30 lines of splendor:

Public Shared Sub DoSignatureFileUpdate(ByVal oldZip As String, ByVal newZip As String)
    Dim baseDir = "\\serverXYZ\RoamingData\Applications"
    ' Assumes Imports System.IO and System.Linq at the top of the file.

    Dim sigDirs =
        From a In Directory.GetDirectories(baseDir)
        Let sigPath = New DirectoryInfo(Path.Combine(a, "Application Data\Microsoft\Signatures"))
        Where sigPath.Exists
        Select sigPath

    Dim sigFiles =
        From a In sigDirs.SelectMany(Function(d) Directory.GetFiles(d.FullName, "*.txt"))
        Let txt = New FileInfo(a)
        Let rtf = New FileInfo(Path.ChangeExtension(a, "rtf"))
        Let htm = New FileInfo(Path.ChangeExtension(a, "htm"))
        Where rtf.Exists _
        AndAlso htm.Exists
        Order By txt.FullName
        Select New With {.Txt = txt, .Rtf = rtf, .Htm = htm}

    Dim latin1 = System.Text.Encoding.GetEncoding(1252) ' Windows-1252 code page
    For Each fnfo In sigFiles.SelectMany(Function(x) {x.Txt, x.Rtf, x.Htm})
        Dim contents = File.ReadAllText(fnfo.FullName, latin1)
        Dim newContents = contents.Replace(oldZip, newZip)
        If newContents <> contents Then ' only rewrite files that actually changed
            File.WriteAllText(fnfo.FullName, newContents, latin1)
            Console.WriteLine(fnfo.FullName)
        End If
    Next
End Sub

Really powerful stuff.  This whole process took less than 2 seconds to run.

Wednesday, January 19, 2011

Linq is awesome

I have said this at work at least once a day for the past 3 years.  It makes my job so much easier.  How did I ever manage without it?  Linq is AWESOME.  That is all.