Monday 9 March 2009

Function for returning the nth occurrence of a substring in Excel

Here’s another bit of code that you’re either looking for or you’re not.  In all honesty, I’m posting this mainly because I use it again and again, and having the code here means I can always get to it.

So, where the built-in INSTR function in Excel VBA (or the FIND worksheet function) returns the position of a substring within a string (e.g. looking for the e in Apple would return 5), InStrNr extends the functionality by returning the nth occurrence of a substring within a string (so looking for the second p in Apple would return 3).

I mostly use this when a string includes a delimiter of some sort and need a way to find the right start position for a MID() function to grab just part of that string, so for example getting back the day from a m/d/yyyy date*. (This is something I do a lot because I’m in the UK where we have dd/mm/yyyy and end up converting by hand when importing from CSV files and the like if Excel won’t figure it out for us.)

Hopefully the argument names should be self explanatory, but to use my earlier example of returning the second p in Apple you’d use

=InStrNr(“Apple”,”p”,2)

Anyway, here’s the code:

Function InStrNr(FindIn As String, TextToFind As String, OccurrenceToFind As Integer) As Integer

Dim ThisOccurrenceNum As Integer, ThisOccurrencePos As Integer

Do

    ThisOccurrenceNum = ThisOccurrenceNum + 1

    ThisOccurrencePos = InStr(ThisOccurrencePos + 1, FindIn, TextToFind)

    If ThisOccurrenceNum = OccurrenceToFind Then

        InStrNr = ThisOccurrencePos

    End If

Loop While ThisOccurrencePos > 0 And ThisOccurrenceNum < OccurrenceToFind

End Function

As with most of my quick-and-dirty code there’s no error handling or validation of inputs, so if the function doesn’t get the arguments it’s expecting you’ll get an error of some sort.  Feel free to post a comment if you get stuck and I’ll try to help so much as my meagre coding skills allow!

* For a mm/dd/yyyy date in cell A1 the day would always be =VALUE(MID(A1,4,2), but with m/d/yyyy the ‘4’ could be a ‘3’ depending on whether the month has one or two digits.  With a m/d/yyyy date I’d use =VALUE(MID(A1,InStrNr(A1,”/”,1),InStrNr(A1,”/”,2)-InStrNr(A1,”/”,1)-1)).  Which looks pretty horrible at first glance but all we’re doing is using the position of the two slash characters to work out where the day part of the date starts and finishes.

2 comments:

  1. I came across a similar issue, but needed an offset parameter as well (for the first search), so ended up writing my own version. It's slightly less verbose as well, so I figured it's worth sharing for others looking for a solution.

    Function InStrNr(ByVal offset As Long, haystack As String, needle As String, Optional occurrence As Integer = 1) As Long
    Dim i As Integer
    For i = 1 To occurrence
    offset = InStr(offset + 1, haystack, needle)
    Next i
    InStrNr = offset
    End Function

    ReplyDelete
  2. Thanks Tycho, always glad to have my scripts improved upon. And I love your parameter naming :)

    ReplyDelete

Creative Commons License This work by TechieBird is licensed under a Creative Commons Attribution-No Derivative Works 2.0 UK: England & Wales License.