shell - Extracting by keywords
input file:
11 message1(num:1;name:"ee";job:aaffdfd);
12 message2(category:"dds";num:2;name:"dfdsf");
output:
11,1,ee,aaffdfd,"message1(num:1;name:"ee";job:aaffdfd)"
12,2,dfdsf,0,"message2(category:"dds";num:2;name:"dfds
This is what I have tried:
awk '{print $1}' all.txt > out1
awk '{ printf("\""); for (i = 2; i <= NF; i++) { printf("%s ", $i); } printf("\"\n") }' all.txt > out2
awk -F'name:"|";' '{print $2}' all.txt > out3
awk -F".*job:|;|)" '/classtype:/{print $2;next}{print 0}' all.txt > out4
awk -F".*num:|;|)" '{print $2}' all.txt > out5
paste out1 out2 out3 out4 out5 > final
The columns of the output file should be as follows:
- first column - same as the first column of the input file
- second column - the number between num: and ;
- third column - the string between name:" and ";
- fourth column - the string between job: and ; (if it is not present in the input line, put 0 in the output)
- fifth column - everything from the second column to the end of the line
Currently I extract the fields separately into different files using different awk commands, then merge the files with the paste command. Is it possible to do this with a single awk command, or in a more optimised way?
It's not pretty, but here's a way you can achieve the desired output using GNU awk:
$ awk -v OFS=, '{sub(/;$/,""); print $1, gensub(/.*num:([0-9]+).*/,"\\1",1), gensub(/.*name:"([^"]+).*/,"\\1",1), (/job/?gensub(/.*job:([^;)]+).*/,"\\1",1):0), "\""$2"\""}' file
11,1,ee,aaffdfd,"message1(num:1;name:"ee";job:aaffdfd)"
12,2,dfdsf,0,"message2(category:"dds";num:2;name:"dfdsf")"
The output field separator OFS is set to a comma. sub removes the semicolon from the end of each line. gensub is used here to extract the parts of the line that you're interested in; unlike sub, it leaves the line unchanged and returns the result of each substitution. The ternary operator is used to add a 0 if /job/ is not matched on the line. Using the default field separator, $2 contains everything after the first number.
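gensub is a GNU extension, so the command above won't run in other awks. As a rough sketch of the same idea using only POSIX awk functions (index, match and substr), something like the following might work. The names extract_fields and after are made up for illustration, and the sketch assumes every value ends at the first ", ; or ) that follows its key. Unlike the greedy gensub patterns, it takes the first occurrence of each key rather than the last.

```shell
# Hypothetical helper: reads lines in the question's format on stdin
# and writes the comma-separated result to stdout.
extract_fields() {
  awk -v OFS=, '
    # Return the text following "key" in s, up to the next ", ; or ).
    # Return dflt when key does not occur in s.
    function after(s, key, dflt,    p, rest) {
      p = index(s, key)
      if (p == 0) return dflt
      rest = substr(s, p + length(key))
      if (match(rest, /[";)]/)) rest = substr(rest, 1, RSTART - 1)
      return rest
    }
    {
      sub(/;$/, "")   # drop the trailing semicolon
      print $1, after($0, "num:", ""), after($0, "name:\"", ""), after($0, "job:", 0), "\"" $2 "\""
    }
  '
}
```

With the file from the question, it would be called as extract_fields < all.txt > final.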