Playing with Regex

Notes on the Le Wagon Regular Expressions Anagrams exercise

For this exercise we want to write a method that will check if two strings are anagrams.

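```ruby
def anagrams?(a_string, another_string)
  # Normalize both strings the same way: lowercase, sort the characters,
  # then strip whitespace and any non-alphanumeric characters.
  a_string = a_string.downcase.chars.sort.join.gsub(/\s+/, "").gsub(/[^0-9a-zA-Z]/, "")
  another_string = another_string.downcase.chars.sort.join.gsub(/\s+/, "").gsub(/[^0-9a-zA-Z]/, "")
  # The strings are anagrams when their normalized forms match.
  return a_string == another_string
end
```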

In this first method we reassign the two parameters, `a_string` and `another_string`, applying the same chain of calls to each of them.

We start by calling:

• `.downcase` which will change all uppercase letters to lowercase

• `.chars` which will return an array of characters from the string. In other words, it splits the word up into separate letters

• `.sort` which will sort the letters into alphabetical order

• `.join` which will join the separate letters back together into one string. `[ "a", "b", "c" ].join #=> "abc"`

• `.gsub` which will replace every match of the pattern given as its first argument with the string given as its second argument. In our example the first call replaces any run of whitespace with an empty string: `\s` matches a whitespace character and `+` means one or more of them. The second call removes every character that is not a digit 0-9, a lowercase letter a-z or an uppercase letter A-Z. The `^` at the start of the character class negates it.
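To see what each of the two patterns removes, here is a small sketch. The helper names `strip_whitespace` and `strip_non_alphanumerics` and the sample string are made up for illustration and are not part of the exercise.

```ruby
# Hypothetical helper: removes every run of whitespace.
def strip_whitespace(text)
  text.gsub(/\s+/, "")
end

# Hypothetical helper: removes every character that is not
# a digit or an ASCII letter.
def strip_non_alphanumerics(text)
  text.gsub(/[^0-9a-zA-Z]/, "")
end

sample = "Dormitory 2, Dirty-Room!"
puts strip_whitespace(sample)                           # Dormitory2,Dirty-Room!
puts strip_non_alphanumerics(strip_whitespace(sample))  # Dormitory2DirtyRoom
puts strip_non_alphanumerics(sample)                    # Dormitory2DirtyRoom
```

The last line suggests that the second pattern already removes whitespace on its own, so the `\s+` call is not strictly needed.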

Now we can return true if `a_string` is equal to `another_string`, and false otherwise.

To demonstrate how this works, let's take the example of the word POST and the word SPOT:

`POST -> ['p', 'o', 's', 't'] -> ['o', 'p', 's', 't'] -> "opst"`

`SPOT -> ['s', 'p', 'o', 't'] -> ['o', 'p', 's', 't'] -> "opst"`

These are anagrams because both words produce the same sorted string, "opst".
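To watch the chain work one call at a time, the following sketch returns each intermediate value. `normalization_steps` is a hypothetical helper written for this note. It only mirrors the calls used in `anagrams?`.

```ruby
# Hypothetical helper: returns every intermediate value of the chain.
def normalization_steps(word)
  lowered = word.downcase                                     # "POST" -> "post"
  letters = lowered.chars                                     # array of single characters
  sorted  = letters.sort                                      # alphabetical order
  joined  = sorted.join                                       # back to one string
  cleaned = joined.gsub(/\s+/, "").gsub(/[^0-9a-zA-Z]/, "")   # strip unwanted characters
  [lowered, letters, sorted, joined, cleaned]
end

%w[POST SPOT].each do |word|
  puts "#{word}: #{normalization_steps(word).inspect}"
end
```

Both words should end with the same final value, "opst", which is exactly what `anagrams?` compares.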

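What if we wanted to make our method faster by improving its time complexity, that is, how its running time grows as the input strings get longer? Sorting the characters takes roughly O(n log n) time for a string of length n, whereas counting them in a single pass takes O(n).

To do this we might want to use a `hash`. Let's create a method that builds a hash in which we can store the number of times each character occurs.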

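```ruby
def create_hash(word)
  # Lowercase the word and drop everything that is not a letter or digit.
  # No sorting is needed, because hash equality ignores key order.
  word = word.downcase.gsub(/[^0-9a-z]/, "")
  new_hash = {}
  word.each_char do |char|
    if new_hash.key?(char)
      new_hash[char] += 1 # seen before: increment the count
    else
      new_hash[char] = 1  # first occurrence of this character
    end
  end
  # Return the whole hash (letters and their counts), not just the counts.
  return new_hash
end
```

This method takes one argument, `word`. We first reassign `word` to a lowercased copy with every character other than letters and numbers removed. Unlike the first method, there is no need to sort here, because two hashes are equal when they hold the same keys and values regardless of order. We also set `new_hash` to an empty hash, `{}`.

• Iterate over `word` with `.each_char`, which passes each character of the word to the block.

• If `new_hash` already contains the key `char`, we add 1 to the value stored under that key.

• Else, we create a new key `char` in the hash with a value of 1.

Finally we return `new_hash`, which maps each character to the number of times it occurs. Returning only `new_hash.values` would discard the letters, so words such as "aab" and "ccd" would both produce `[2, 1]` and be wrongly reported as anagrams.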
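The `if`/`else` counting loop can be written more compactly by giving the hash a default value of 0. This is a minimal sketch of that idea. `count_chars` is a hypothetical name and not part of the exercise.

```ruby
# Hypothetical helper: counts how often each character occurs.
def count_chars(word)
  counts = Hash.new(0)                       # missing keys read as 0
  word.each_char { |char| counts[char] += 1 }
  counts
end

counts = count_chars("hello")
puts counts["l"]                                                  # 2
puts counts == { "h" => 1, "e" => 1, "l" => 2, "o" => 1 }         # true
```

On Ruby 2.7 or newer, `word.each_char.tally` should produce the same counts in a single call.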

We need to modify our first method so that we can take advantage of the hash we just created.

```ruby
def anagrams_on_steroids?(a_string, another_string)
  # Build the character data for each string, then compare the results.
  a_hash = create_hash(a_string)
  another_hash = create_hash(another_string)
  return a_hash == another_hash
end
```

Here, we set up two variables and assign each one the result of calling `create_hash`, passing `a_string` and `another_string` respectively. We then return true or false depending on whether `a_hash` and `another_hash` are equal.

The flow of this can be confusing so to recap:

• If we pass the word ‘POST’ to `create_hash`, the hash starts out empty, so the `if` check fails for each of its four letters and each letter is added as a new key with a value of 1. A count is only incremented when the same letter appears again within the same word, as with the two ‘l’s in ‘hello’.

• `create_hash` is then called again with ‘SPOT’. This call builds its own fresh hash, so nothing carries over from the first call. Because ‘SPOT’ contains the same letters the same number of times, the two results are equal and the method returns true.
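As a closing sketch, here is the whole counting approach in one self-contained piece, with a few checks. The names `letter_counts` and `anagram_by_counts?` are hypothetical, and the code assumes Ruby 2.7 or newer for `tally`. It compares the full hashes (letters and counts), since counts alone cannot tell "aab" from "ccd".

```ruby
# Hypothetical helper: maps each letter or digit to its number of occurrences.
# Assumes Ruby 2.7+ for Enumerable#tally.
def letter_counts(text)
  text.downcase.gsub(/[^0-9a-z]/, "").each_char.tally
end

# Hypothetical helper: two strings are anagrams when their counts match.
def anagram_by_counts?(a_string, another_string)
  letter_counts(a_string) == letter_counts(another_string)
end

puts anagram_by_counts?("POST", "SPOT")              # true
puts anagram_by_counts?("Dormitory", "Dirty room!")  # true
puts anagram_by_counts?("aab", "ccd")                # false
```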