Playing with Regex
Notes on Le Wagon Regular Expressions Anagrams exercise
For this exercise we want to write a method that will check if two strings are anagrams.
def anagrams?(a_string, another_string)
a_string = a_string.downcase.chars.sort.join.gsub(/\s+/, "").gsub(/[^0-9a-zA-Z]/, "")
another_string = another_string.downcase.chars.sort.join.gsub(/\s+/, "").gsub(/[^0-9a-zA-Z]/, "")
return a_string == another_string
end
For this first method we start out with creating two variables. These are called a_string
and another_string
respectivley. We apply the same logic to both of the given arguments.
We start by calling:
-
.downcase
which will change all uppercase letters to lowercase -
.chars
which will return an array of characters from the string. In other words, splits up the word into seperate letters -
.sort
which will sort the letters into alphabetical order -
.join
which will join the seperate letters together.[ "a", "b", "c" ].join #=> "abc"
-
.gsub
will sub out the characters given as the first argument and replace them with the characters given in the second argument. In our example we will sub out any number of whitespace characters and replace them with nothing. This is denoted by the\s
standing for whitespace character and+
which means that there can be one or more whitespaces. We also call.gsub
again and this time we say to sub out any character except numbers 0-9, lowercase letters a-z and uppercased letters A-Z. The^
denotes the exception.
Now we can return true if a_string
is equal to another_string
To demonstrate how this works lets take the example of the word POST and the word SPOT
POST -> ['p', 'o', 's', 't'] -> ['o', 'p', 's', 't'] -> "opst" SPOT -> ['s', 'p', 'o', 't'] -> ['o', 'p', 's', 't'] -> "opst"
This is am anagram because both results have returned in the same order.
What if we wanted to make our method faster by improving its Time Complexity? (Time taken to complete a method)
To do this we might want to use a hash
. Lets create a method that will create our hash which we can use to store data inside.
def create_hash(word)
word = word.downcase.chars.sort.join.gsub(/\s+/, "").gsub(/[^0-9a-zA-Z]/, "")
new_hash = {}
word.each_char do |char|
if new_hash.key?(char)
new_hash[char] += 1
else
new_hash[char] = 1
end
end
return new_hash.values
end
This array takes one argument of word
. We can then create a new variable which will be equivilent to downcasing the word and substituting any characters other than letters and numbers. We also set our new_hash
to empty {}
.
-
Iterate over
word
with.each_char
. This is just going to pass each character of the word to the block. -
Then we say if our
new_hash
contains the key ofchar
we can add 1 to this value stored in ournew_hash
-
Else, we create a new key of
char
in our hash.
Finally we return the new_hash.values
. This gives us a new array which is populated by groups of values from the hash.
We need to modify our first method so that we can take advantage of the hash we just created.
def anagrams_on_steroids?(a_string, another_string)
a_hash = create_hash(a_string)
another_hash = create_hash(another_string)
return a_hash == another_hash
end
Here, we are setting up two variables and calling our create_hash
method inside each of them. We give it the required arguments of a_string
and another_string
respectively. We then return true or false depending on if the a_hash
and another_hash
variables are the same.
The flow of this can be confusing so to recap:
-
If we give the word ‘POST’ as an argument into
create_hash
then as its run for the first time it will reach theif
statement and will not be able to find a key ofpost
as the hash is currently empty. It will therefore create a new key ofopst
. However whencreate_hash
is run a second time with the wordSPOT
then it will see that there is in fact a key of opst in the hash and it therfore increments its value by 1. -
When
new_hash.values
is called then it will display an array with the group of values.