linux - Lowercase all text except XML tags


I've got a large number of tagged strings:

Watch <Team>Philly's</Team> game What's on <Time>Wednesday night 8 o'clock</Time>

I want to lowercase the text but leave the XML tags unchanged, i.e.:

watch <Team>philly's</Team> game what's on <Time>wednesday night 8 o'clock</Time>

I can lowercase the text using awk:

awk '{print tolower($0)}' file.txt 

but I have no idea how to leave the XML tags untouched. Any language or tool is welcome.

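This GNU sed one-liner may help:

sed -r 's/([^<>]*)($|<)/\L\1\E\2/g'

Here \L lowercases the replacement text until \E, so only the text outside the tags (group 1) is converted; group 2 (the following < or the end of the line) is copied as-is.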

With an example:

kent$ echo "Watch <Team>Philly's</Team> game What's on <Time>Wednesday night 8 o'clock</Time>" | sed -r 's/([^<>]*)($|<)/\L\1\E\2/g'
watch <Team>philly's</Team> game what's on <Time>wednesday night 8 o'clock</Time>
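Since the question welcomes other tools, and the case-conversion escapes in the sed replacement are a GNU extension, the same idea can be sketched in POSIX awk. This is only a rough sketch, not part of the original answer: the function name lower_outside_tags is made up for illustration, and it assumes that tags never span lines and that a literal < or > never appears in the plain text.

```shell
# Hypothetical helper: lowercase everything outside <...> tags.
# Assumptions: tags do not span lines; "<" and ">" only occur as tag delimiters.
lower_outside_tags() {
  awk '{
    out = ""
    rest = $0
    # Peel off the text before the next tag (lowercased), then the tag itself (unchanged).
    while (match(rest, /<[^>]*>/)) {
      out = out tolower(substr(rest, 1, RSTART - 1)) substr(rest, RSTART, RLENGTH)
      rest = substr(rest, RSTART + RLENGTH)
    }
    # Whatever remains after the last tag is plain text.
    print out tolower(rest)
  }' "$@"
}
```

It reads from standard input or from the files given as arguments, e.g. lower_outside_tags file.txt. Unlike the sed version, it handles each tag explicitly in a loop, which is longer but makes it easy to see which pieces are converted.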
