Package strings
Just like Java or C++, Go ships with a standard library of packages. The one you will probably end up using most often is strings.
For this post, I worked through another practice exercise: counting how many times each word occurs in an input string.
Definition of Word
- A number composed of one or more ASCII digits (e.g. “0” or “1234”) OR
- A simple word composed of one or more ASCII letters (e.g. “a” or “they”) OR
- A contraction of two simple words joined by a single apostrophe (e.g. “it’s” or “they’re”)
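One way to make this definition precise is a regular expression. The sketch below is not the approach used in the solution later in this post; it is an alternative for comparison. The names wordPattern and extractWords are made up for illustration, and the pattern assumes a plain ASCII apostrophe (') rather than a typographic one.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordPattern is a hypothetical pattern, not part of the original solution:
// a run of ASCII digits, or a run of ASCII letters optionally followed by
// one apostrophe and another run of letters.
var wordPattern = regexp.MustCompile(`[0-9]+|[a-z]+(?:'[a-z]+)?`)

// extractWords lowercases the input and returns every match of wordPattern.
func extractWords(in string) []string {
	return wordPattern.FindAllString(strings.ToLower(in), -1)
}

func main() {
	fmt.Printf("%q\n", extractWords("They're here, it's 1234!"))
	// prints ["they're" "here" "it's" "1234"]
}
```

Note that this pattern treats a mixed token such as abc123 as two words (abc and 123), which may or may not be what you want.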
Package: strings
The strings package exposes utility functions to manipulate UTF-8 encoded strings.
Strings are value types and immutable, which means that once created, you cannot modify the contents of the string. In other words, strings are immutable arrays of bytes.
— Official GoDocs
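A small sketch of what immutability means in practice. The helper name capitalizeFirst is made up for illustration, and it assumes a non-empty input that starts with a lowercase ASCII letter.

```go
package main

import (
	"fmt"
	"strings"
)

// capitalizeFirst is a hypothetical helper. It cannot edit s in place
// (writing s[0] = 'G' does not compile), so it edits a byte-slice copy.
// Assumes s is non-empty and starts with a lowercase ASCII letter.
func capitalizeFirst(s string) string {
	b := []byte(s) // the conversion copies the bytes of s
	b[0] = b[0] - 'a' + 'A'
	return string(b)
}

func main() {
	s := "golang"
	upper := strings.ToUpper(s) // returns a new string; s is untouched
	fmt.Println(s, upper)       // prints: golang GOLANG
	fmt.Println(s, capitalizeFirst(s)) // prints: golang Golang
}
```

This is why every function in the strings package returns a new value instead of changing its argument.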
Solution
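package wordcount

import (
	"strings"
	"unicode"
)

// Frequency maps each word to the number of times it occurs.
type Frequency map[string]int

// WordCount counts the occurrences of each word in the input.
func WordCount(in string) Frequency {
	myMap := make(Frequency)
	in = strings.ToLower(in)
	// Split on spaces, newlines, tabs and commas.
	inStrings := strings.FieldsFunc(in, func(c rune) bool {
		return c == ' ' || c == '\n' || c == '\t' || c == ','
	})
	for _, val := range inStrings {
		// Strip leading and trailing punctuation; an apostrophe
		// inside a contraction is kept.
		val = strings.TrimFunc(val, func(r rune) bool {
			return !unicode.IsLetter(r) && !unicode.IsDigit(r)
		})
		// Skip fields that were punctuation only, so "" is never counted.
		if val == "" {
			continue
		}
		myMap[val]++
	}
	return myMap
}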
Functions used
- FieldsFunc
- TrimFunc
FieldsFunc
More details here.
func FieldsFunc(s string, f func(rune) bool) []string
FieldsFunc splits the string s at each run of Unicode code points c satisfying f(c) and returns an array of slices of s. If all code points in s satisfy f(c) or the string is empty, an empty slice is returned. FieldsFunc makes no guarantees about the order in which it calls f(c). If f does not return consistent results for a given c, FieldsFunc may crash.
golang official documentation
func(c rune) bool {
	return c == ' ' || c == '\n' || c == '\t' || c == ','
}
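FieldsFunc takes an input string s and a function f that is called on the runes (Unicode code points) of s. The string is split wherever f returns true, and the separator runes themselves are dropped. A run of consecutive separators counts as a single split, so no empty fields are produced.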
Example
in > testing, 1, 2 testing
FieldsFunc splits it as follows:
{"testing", "1", "2", "testing"}
TrimFunc
More details here.
func TrimFunc(s string, f func(rune) bool) string
TrimFunc returns a slice of the string s with all leading and trailing Unicode code points c satisfying f(c) removed.
golang documentation
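This function trims unwanted characters from both ends of the string. On each side it stops at the first character for which f returns false.
func(r rune) bool {
	return !unicode.IsLetter(r) && !unicode.IsDigit(r)
}
Essentially, we strip everything that is not a letter or digit from the start and end of the string. Characters in the middle, such as the apostrophe in a contraction, are left alone.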
Example
in > car: carpet as java: javascript!!&@$%^&
Once the input has been split into fields, car: is trimmed to car and java: to java. Likewise, javascript!!&@$%^& is stripped of the unwanted characters at the end, leaving javascript.
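A runnable sketch of the trimming step, applied to each whitespace-separated field of the example above. The name trimWord is made up for illustration, and the quoted 'don't' field is an extra input I added to show what happens to an apostrophe inside a word.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimWord is a hypothetical helper: it strips leading and trailing runes
// that are neither letters nor digits, and leaves the middle untouched.
func trimWord(s string) string {
	return strings.TrimFunc(s, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

func main() {
	in := "car: carpet as java: javascript!!&@$%^& 'don't'"
	for _, field := range strings.Fields(in) {
		fmt.Printf("%q -> %q\n", field, trimWord(field))
	}
	// "car:" becomes "car", "javascript!!&@$%^&" becomes "javascript",
	// and "'don't'" becomes "don't" (the inner apostrophe survives).

	// A field made only of punctuation trims to the empty string.
	fmt.Printf("%q\n", trimWord("!!&@")) // prints ""
}
```

The last line is worth keeping in mind: a punctuation-only field trims down to an empty string, so a word counter may want to skip empty results.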