nlp - Splitting a domain name into constituent words (if possible)?


I want to break a domain name into component terms and numbers like

iamadomain11.com = [ 'I', 'am', 'A', 'domain', '11']

How do I do this? I know that many segmentations may be possible; however, I am fine with getting just one of them.

This is actually solved in the O'Reilly Media book Beautiful Data. Chapter 14, "Natural Language Corpus Data", builds a splitter that does what you want, using a huge, freely available token frequency data set.
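To give a rough idea of how a frequency-based splitter can work, here is a minimal sketch. It is not the book's code. It assumes a simple unigram model: every candidate split is scored by the sum of the log-probabilities of its words, and the best-scoring split wins. The `COUNTS` table is a tiny made-up stand-in for a real token frequency data set, and the names `word_logprob`, `segment` and `split_domain` are hypothetical.

```python
import math
import re
from functools import lru_cache

# Toy unigram counts, invented for illustration only. A real splitter
# would load a large token-frequency table here instead.
COUNTS = {
    "i": 500, "am": 300, "a": 800, "an": 300,
    "domain": 50, "do": 200, "main": 60, "name": 70,
}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # assumption: no single word is longer than this


def word_logprob(word):
    """Log10 probability of one word under the unigram model."""
    if word in COUNTS:
        return math.log10(COUNTS[word] / TOTAL)
    # Unknown word: penalise heavily, and more so the longer it is.
    return math.log10(1.0 / (TOTAL * 10 ** len(word)))


@lru_cache(maxsize=None)
def segment(text):
    """Return (score, words) for the most probable split of text."""
    if not text:
        return 0.0, ()
    candidates = []
    for i in range(1, min(len(text), MAX_WORD_LEN) + 1):
        first, rest = text[:i], text[i:]
        rest_score, rest_words = segment(rest)
        candidates.append((word_logprob(first) + rest_score,
                           (first,) + rest_words))
    return max(candidates)


def split_domain(domain):
    """Split the first label of a domain name into words and numbers."""
    label = domain.lower().split(".")[0]
    words = []
    # Digit runs are kept whole; letter runs go through the segmenter.
    for chunk in re.findall(r"[a-z]+|[0-9]+", label):
        if chunk.isdigit():
            words.append(chunk)
        else:
            words.extend(segment(chunk)[1])
    return words


print(split_domain("iamadomain11.com"))
# ['i', 'am', 'a', 'domain', '11']
```

With the toy table this returns a single split, in lower case. The quality of the result depends entirely on the frequency data, so with a real data set you would replace `COUNTS` and probably tune the penalty for unknown words.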

