Perl regex: limiting the pattern to only the first occurrence of a character


I am trying to extract the content of a date element from many ill-formed SGML documents. For instance, a document can contain a simple date element such as

  <DATE>4 July 1936</DATE>

or

  <DATE blaAttrib="89787adjd98d9">4 July 1936</DATE>

But it can also be as hairy as:

  <DATE blaAttrib="89787adjd98d9">4 July 1936<EM>EM element within the date</EM></DATE>

The goal is to get "4 July 1936". Since the files are not large, I read the entire content into a variable and apply a regex to it. The following is a snippet of my Perl code:

  {
      local $/ = undef;
      open FILE, "$file" or die "Could not open file: $!";
      $fileContent = <FILE>;
      close FILE;
      if ($fileContent =~ m/<DATE(.*)>(.*)<\/DATE>/) {
          # $2 should be "4 July 1936", but it is not.
      }
  }

Unfortunately, the regex does not work for the hairy example. The reason is that there is an <EM> element inside the <DATE> element, and the element also spans multiple lines.

Could any kind soul give me some pointers, directions, or clues?

Thanks!

.

But going by your example, maybe you can try:

  if ($fileContent =~ m/<DATE[^>]*>([^<]+)/) {
      # Use $1
      # You may need to strip newlines
  }
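To see how the negated character classes behave, here is a minimal runnable sketch. The helper name `extract_date` and the sample strings are made up for illustration (the samples are modeled on the question's examples, with line breaks added to the nested one), and the whitespace cleanup is just one possible way to strip the newlines mentioned above. `[^>]*` stops at the first `>`, so it cannot run past the opening tag, and `[^<]+` stops at the first `<`, so the capture ends before any nested element. A negated class also matches newlines, so no `/s` modifier is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text that follows the first <DATE ...> tag,
# up to (but not including) the next '<', or undef if nothing matches.
sub extract_date {
    my ($content) = @_;
    if ($content =~ m/<DATE[^>]*>([^<]+)/) {
        my $date = $1;
        $date =~ s/\s+/ /g;         # collapse newlines and runs of whitespace
        $date =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
        return $date;
    }
    return;
}

# Sample inputs modeled on the question (assumed, not taken from real files).
my @samples = (
    '<DATE>4 July 1936</DATE>',
    '<DATE blaAttrib="89787adjd98d9">4 July 1936</DATE>',
    "<DATE blaAttrib=\"89787adjd98d9\">4 July 1936\n<EM>EM element within the date</EM>\n</DATE>",
);

for my $sample (@samples) {
    my $date = extract_date($sample);
    print defined $date ? "$date\n" : "no match\n";
}
```

This is only a sketch for input shaped like the examples. It would stop early if an attribute value contained a `>`, and it returns only the text before the first nested tag, so a proper SGML or XML parser is likely the safer choice for anything more irregular.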
