Perl regex syntax
I want to use Perl to take a previously generated SPSS syntax file and format it for use in an R environment.
This is probably a very easy task for people familiar with Perl and regex, but I'm stumbling.
The steps as I have laid them out for this Perl script are:
- Read in the SPSS file
- Find the appropriate chunks of the SPSS syntax
- Return the R syntax on the command line or, preferably, to a file.
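The first two steps can be sketched as follows. This is only a minimal sketch, assuming the block of interest starts at a line reading "Value Labels" and ends at a line reading "execute ."; the helper name `extract_value_labels` and the inline sample are hypothetical and exist purely for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return only the lines that sit between a
# "Value Labels" line and the following "execute ." line.
sub extract_value_labels {
    my @lines = @_;
    my @chunk;
    my $in_block = 0;
    for my $line (@lines) {
        if ($line =~ /^\s*value\s+labels\b/i) { $in_block = 1; next; }
        if ($line =~ /^\s*execute\s*\./i)     { $in_block = 0; next; }
        push @chunk, $line if $in_block;
    }
    return @chunk;
}

# Read the file named on the command line, or fall back to a small
# inline sample (assumed input, not real data).
my @lines;
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    @lines = <$fh>;
    close $fh;
}
else {
    @lines = map { "$_\n" } (
        '...nonsense...', 'Value Labels', '/gender', '1 "M"', '2 "F"',
        'execute .', '...more nonsense...',
    );
}

print extract_value_labels(@lines);
```

Passing a file name as the first argument (for example the append.txt used in the script below) reads that file instead of the inline sample. An explicit flag is used here rather than Perl's range operator because it keeps the start and end lines out of the result without extra bookkeeping.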
The basic form of the SPSS value labels syntax is:

    ...A bunch of nonsense I do not care about...
    ...
    Value Labels
    /gender
    1 "M"
    2 "F"
    /purpose
    1 "business"
    2 "vacation"
    3 "tiddlywinks"
    execute .
    ...Resume nonsense...

and the desired R syntax looks like this:

    gender <- as.factor(gender, levels = c(1,2), labels = c("M","F"))
    ...
Here is the Perl script I have written so far. I have successfully read each line into the appropriate array. I have the general flow of what I want for the final print function, but I still need to figure out how to print only the appropriate @levels and @labels entries for each element of @vars.

    #!/usr/bin/perl

    # Need to figure out how to read from the command line
    open(VARVAL, "append.txt");
    @lines = <VARVAL>;
    close(VARVAL);

    # Read through each line and put it into a variable, a value, or a label.
    # I really only want to read everything between "value labels" and "execute".
    # Probably need more regex...
    foreach (@lines) {
        if ($_ =~ /\//) {            # anything with a / is a variable
            $_ =~ tr/\///d;
            push(@vars, $_)
        }
        elsif ($_ =~ /\d/) {
            push(@vals, $_)          # anything that has a number in the line is a value
        }
    }

    # Split each @vals entry into levels and labels
    foreach (@vals) {
        @values = split(/\s+/, $_);  # splitting on whitespace is vulnerable... better regex?
        foreach (@values) {
            if ($_ =~ /\d/) {
                push(@levels, $_);
            }
            else {
                push(@labels, $_)
            }
        }
    }

    # Get rid of the newlines.
    # Should this be done somewhere else?
    chomp(@vars);
    chomp(@levels);
    chomp(@labels);

    # Need to tell it when to stop adding @levels & @labels. While loop? Hash lookup?
    # Need to get rid of the final comma
    # Output needs to be printed to a file
    foreach (@vars) {
        print $_ . " <- as.factor(" . $_ . "\n\t, levels = c(";
        foreach (@levels) {
            print $_ . ",";
        }
        print ")\n\t, labels = c(";
        foreach (@labels) {
            print $_ . ",";
        }
        print ")\n\t)\n";
    }
And finally, here is sample output from the script as it currently runs:

    gender <- as.factor(gender
        , levels = c(1,2,1,2,3,)
        , labels = c("M","F","biz","action","tiddlywinks",)
        )

I need the output to include only levels 1,2 and labels M and F. Thanks for the help!
This works for me:
    #!/usr/bin/env perl
    use strict;
    use warnings;

    my @lines = <DATA>;

    my $current_label = '';
    my @ordered_labels;   # variable names in the order they appear
    my %data;             # variable name => { level => label }

    for my $line (@lines) {
        if ($line =~ /^\/(.*)$/) {    # starts with a slash: a new variable
            $current_label = $1;
            push @ordered_labels, $current_label;
            next;
        }

        if (length $current_label) {
            if ($line =~ /^(\d) "(.*)"$/) {    # a level followed by a quoted label
                $data{$current_label}{$1} = $2;
                next;
            }
        }
    }

    for my $label (@ordered_labels) {
        print "$label <- as.factor($label\n";
        print "    , levels = c(";
        print join(',', map { $_ } sort keys %{ $data{$label} });
        print ")\n";
        print "    , labels = c(";
        print join(',',
            map { '"' . $data{$label}{$_} . '"' }
            sort keys %{ $data{$label} });
        print ")\n";
        print "    )\n";
    }

    __DATA__
    ...A bunch of nonsense I do not care about...
    ...
    Value Labels
    /gender
    1 "M"
    2 "F"
    /purpose
    1 "business"
    2 "vacation"
    3 "tiddlywinks"
    execute .

Note that the lines after __DATA__ must start in the first column of the actual file, because the patterns are anchored to the start of the line.
and yields:
    gender <- as.factor(gender
        , levels = c(1,2)
        , labels = c("M","F")
        )
    purpose <- as.factor(purpose
        , levels = c(1,2,3)
        , labels = c("business","vacation","tiddlywinks")
        )
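The same hash-per-variable approach can be made a little more tolerant. The sketch below is an assumption-laden variant, not the answer's own code: the helper name `r_factor_statements` is hypothetical, variable names are assumed to match `\w+`, levels may have several digits and are sorted numerically, and the output calls R's `factor()` because `as.factor()` accepts only the vector itself and has no `levels` or `labels` arguments.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: turn "/var" and 'N "label"' lines into R statements.
sub r_factor_statements {
    my @lines = @_;
    my (@order, %labels_for);
    my $var;
    for my $line (@lines) {
        if ($line =~ m{^\s*/\s*(\w+)}) {
            # A leading slash introduces a new variable (assumes \w+ names).
            $var = $1;
            push @order, $var;
        }
        elsif (defined $var && $line =~ /^\s*(\d+)\s+"([^"]*)"/) {
            # One or more digits, then a double-quoted label.
            $labels_for{$var}{$1} = $2;
        }
    }

    my @statements;
    for my $v (@order) {
        # Numeric sort so that 10 comes after 2.
        my @levels = sort { $a <=> $b } keys %{ $labels_for{$v} || {} };
        my $levels = join ',', @levels;
        my $labels = join ',', map { qq("$labels_for{$v}{$_}") } @levels;
        push @statements,
            "$v <- factor($v, levels = c($levels), labels = c($labels))";
    }
    return @statements;
}

# Illustrative input only; the value 10 is made up to show the numeric sort.
print "$_\n" for r_factor_statements(
    '/gender',  '1 "M"', '2 "F"',
    '/purpose', '1 "business"', '2 "vacation"', '10 "tiddlywinks"',
);
```

Building each list with `join` is what avoids the trailing comma from the question, and keying the levels by variable name is what keeps the levels of one variable from leaking into another.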