Study Notes

Archive for February 2006

sort

February 11, 2006 Phillip No comments

sort: sort lines of text files.
Syntax:
sort [-c] [-m] [-u] [-r] [-t separator] [-o output-file] [other options] [+field.char] input-files
-c : check whether the input is already sorted, without sorting it
-m : merge files that are already sorted
-u : when several lines compare equal, output only one of them, so that the result is unique
-o : write the result to a file instead of standard output
-t : set the field separator; by default fields are separated by whitespace, and "-t" selects another character
-r : reverse the sort order
+field : the field at which the sort key starts
.char : the character position within that field at which the sort key starts
Note: in this "+" notation, fields and characters are counted from 0, not from 1. The notation is obsolete, and current versions of sort may treat an argument such as "+3" as a file name; the "-k" option is the portable replacement. For example:
Text:      My  name  is  Phillip  Huang
Field:     0   1     2   3        4

Text:      P  h  i  l  l  i  p
Character: 0  1  2  3  4  5  6

Position usage:
# sort +field_number.character_number
So the command "# sort +4.6 input-file" means that the sort key starts at the 5th field, 7th character.

If we run "# sort filename" without any options or positions, sort compares whole lines, starting from the first field (field 0).
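
The options listed above can be tried on small throwaway files. The following is only a minimal sketch: the file contents and the variable names (a, b, out, merged and so on) are made up for illustration and do not come from the original notes.

```shell
# Two small files that are already in sorted order (hypothetical data).
a=$(mktemp); b=$(mktemp); out=$(mktemp)
printf 'apple\ncherry\ncherry\n' > "$a"   # contains one repeated line
printf 'banana\ndate\n' > "$b"

# -c only checks the order; exit status 0 means "already sorted".
if sort -c "$a"; then a_sorted=yes; else a_sorted=no; fi

# -m merges pre-sorted files, -u keeps one copy of equal lines,
# -o writes the result to a file instead of standard output.
sort -m -u -o "$out" "$a" "$b"
merged=$(cat "$out")

# -r reverses the order.
reversed=$(sort -r "$b")

printf 'sorted? %s\nmerged:\n%s\nreversed:\n%s\n' "$a_sorted" "$merged" "$reversed"
rm -f "$a" "$b" "$out"
```

Note that -c treats repeated lines as being in order; only a line that sorts before its predecessor makes the check fail.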

Sort by a numeric field:
# sort -t: +3n video.txt
If the "n" modifier is not used here, the result looks like this:
1789
2562
400
This is not what we want: the values are compared character by character as text, so "400" sorts after "2562" because "4" is greater than "2". What happens when "n" is added?
# sort -t: +3n video.txt
400
1789
2562
Now the field is compared as a whole number, which gives the right answer.
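
The difference between text order and numeric order can be reproduced with the three values shown above. This is a small sketch that uses the standard -n option on a one-column temporary file; the variable names are assumptions made for the example.

```shell
# The same three numbers, one per line, in a temporary file.
tmp=$(mktemp)
printf '2562\n400\n1789\n' > "$tmp"

text_order=$(sort "$tmp")        # compared character by character
numeric_order=$(sort -n "$tmp")  # compared as whole numbers

printf 'text order:\n%s\nnumeric order:\n%s\n' "$text_order" "$numeric_order"
rm -f "$tmp"
```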

With the "-k" option, the first field is numbered 1 rather than 0, which is easier to use and less confusing. For example, the following uses a sort key that starts at the 4th field:
# sort -t: -k4 video.txt
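
A sketch of the -k form on a colon-separated file follows. The contents of video.txt are not shown in these notes, so the records below (film_a and so on) are invented purely for illustration. Writing the key as -k4,4 limits it to the 4th field, and the "n" and "r" modifiers request numeric and reversed order.

```shell
# Hypothetical colon-separated records: name:genre:age:count
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
film_a:drama:15:1789
film_b:family:8:400
film_c:horror:18:2562
EOF

by_count=$(sort -t: -k4,4n "$tmp")        # ascending by the 4th field
by_count_desc=$(sort -t: -k4,4nr "$tmp")  # descending by the 4th field

printf 'ascending:\n%s\ndescending:\n%s\n' "$by_count" "$by_count_desc"
rm -f "$tmp"
```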

Category: Technology

sed

February 10, 2006 Phillip No comments

sed is a stream editor. It works on a copy of the input, so the original file is not modified. The output can of course be redirected to a named file.
sed syntax: # sed [options] 'sed_command' input_file
The following option is often used with sed:
-n : suppress automatic printing of the pattern space

There are two ways to select text with sed: by line number and by regular expression.
print syntax: [address[,address]]p
e.g. print the 2nd line:
# sed -n '2p' test.txt
print the 2nd and 3rd lines:
# sed -n '2,3p' test.txt
print the lines that match a given string:
# sed -n '/Yellow/'p test.txt
or # sed -n '/Yellow/p' test.txt
Combining a line number and a pattern:
line_number,/pattern/ (the two addresses are separated by a comma)
# sed -n '4,/The/'p test.txt
print the last line:
# sed -n '$'p test.txt
delete a line with "d":
# sed '1d' test.txt
write selected lines to a new file with "w" (here lines 1 and 2 are written to new_file):
# sed '1,2 w new_file' test.txt
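
The addressing forms above can be checked against a small file. The contents of test.txt are not given in these notes, so the four lines below are an assumption made only for this sketch, as are the variable names.

```shell
# Hypothetical four-line input file.
tmp=$(mktemp); copy=$(mktemp)
cat > "$tmp" <<'EOF'
Red belt
Yellow belt
Green belt
The last line
EOF

line2=$(sed -n '2p' "$tmp")             # a single line number
range=$(sed -n '2,3p' "$tmp")           # a range of line numbers
yellow=$(sed -n '/Yellow/p' "$tmp")     # lines matching a pattern
from3=$(sed -n '3,/The/p' "$tmp")       # from line 3 to the next line matching /The/
last=$(sed -n '$p' "$tmp")              # the last line
without_first=$(sed '1d' "$tmp")        # everything except line 1
sed -n "1,2w $copy" "$tmp"              # write lines 1 and 2 to another file
first_two=$(cat "$copy")

# The input file itself is unchanged by all of the commands above.
printf '%s\n' "$from3"
rm -f "$tmp" "$copy"
```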

How can we tell whether a text file contains control characters? Run "# cat -v filename": control characters are shown in a visible form such as ^M or ^G. If plain cat makes the terminal beep or displays many strange characters on the screen, the file contains control characters.
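
As a small illustration, the following sketch feeds a carriage return and a bell character through cat -v. The sample text is made up, and -v is assumed to be available, as it is in the GNU and BSD versions of cat.

```shell
# \r is a carriage return and \a is the bell character.
visible=$(printf 'line one\r\nbell\a here\n' | cat -v)

# cat -v shows them as ^M and ^G instead of sending them to the terminal.
printf '%s\n' "$visible"
```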

Category: Technology

gawk

February 9, 2006 Phillip No comments

gawk: pattern scanning and processing language.
There are several ways to invoke gawk; two are noted here.
1. General syntax:
gawk [-F field-separator] 'commands' input-files
Whitespace is the default field separator, and another character such as ":" can be chosen with -F. For example,
# gawk -F: 'commands' input-file
2. Put the gawk commands in a file, and then run gawk with that file:
# gawk -f awk-script-file input-files
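
Both ways of running gawk can be sketched with a one-line script. The file names, the sample records and the AWK variable are assumptions for this example; AWK falls back to plain awk when gawk is not installed.

```shell
# Use gawk when available, otherwise any awk (an assumption of this sketch).
AWK=$(command -v gawk || command -v awk)

script=$(mktemp); data=$(mktemp)
printf '{ print $1 }\n' > "$script"                # the commands, stored in a file
printf 'root:x:0:0\nbin:x:1:1\n' > "$data"         # hypothetical colon-separated input

inline=$("$AWK" -F: '{ print $1 }' "$data")        # method 1: commands on the command line
from_file=$("$AWK" -F: -f "$script" "$data")       # method 2: commands read with -f

printf '%s\n' "$from_file"
rm -f "$script" "$data"
```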

A gawk statement has two parts: a pattern and an action. The pattern can be a conditional expression, a compound expression, or a regular expression. The special patterns "BEGIN" and "END" are useful for making the output more readable. The action is enclosed in {}.

Fields and records:
Each line (record) read by gawk is split into fields, referred to in order as $1, $2 ... $n. In a print statement, commas separate the fields to be printed. "$0" means the whole record, that is, all the fields.
To print specific fields or all fields, the "print" statement is needed. It must be placed inside the {} of the action.

Examples:
Text:  Phillip  Cocu  is  DMC.
Field: $1       $2    $3  $4

# cat test1
Cocu Barcelona 1972
Keane Man.Utd 1974
# gawk '{print $1}' test1
Cocu
Keane
# gawk '{print $1,$3}' test1
Cocu 1972
Keane 1974
# gawk '{print $0}' test1
Cocu Barcelona 1972
Keane Man.Utd 1974

tee can be used with gawk to show the result on the screen and save it to an output file at the same time.
# gawk '{print $1,$3}' grade.txt | tee result.out

The BEGIN and END keywords make the output of gawk more readable.
# gawk 'BEGIN {print "this is beginning"} {print $1"\t"$4}' file
This command prints the string "this is beginning" before the normal output, that is, before what the same command would print without the BEGIN block. Similarly, END can print a meaningful message after the last line.
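
A header and a footer can be added to the test1 data in the same way. This is a sketch only: the header text, the footer text and the variable names are invented here, and AWK falls back to plain awk when gawk is not installed.

```shell
AWK=$(command -v gawk || command -v awk)

data=$(mktemp)
printf 'Cocu Barcelona 1972\nKeane Man.Utd 1974\n' > "$data"

# BEGIN runs once before the first record, END once after the last one.
report=$("$AWK" 'BEGIN { print "Name\tYear" }
                 { print $1 "\t" $3 }
                 END { print "end of report" }' "$data")

printf '%s\n' "$report"
rm -f "$data"
```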

The following points need attention:
1. Make sure the whole gawk command is enclosed in single quotes.
2. Quotes inside the command must come in pairs.
3. {} encloses the action, and inside the {} the condition of an if statement must be enclosed in ( ).

The comparison operators of gawk are listed here:
<, <=, ==, !=, >=, >, ~, !~
Note: ~ means that the value on the left must match the given regular expression, and !~ means that it must not match. For example,
# gawk '{if($4 ~ /Brown/) print $0}' test.txt
The regular expression is enclosed in a pair of slashes "/".
# gawk '{if($4=="48") print $0}' test.txt
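
The matching and comparison operators can be tried on a few records. The records below are hypothetical, since the contents of test.txt are not shown in these notes; AWK falls back to plain awk when gawk is not installed.

```shell
AWK=$(command -v gawk || command -v awk)

# Hypothetical records: name, year, score, belt
data=$(mktemp)
cat > "$data" <<'EOF'
Smith 1999 48 Brown
Jones 2001 26 Yellow
Brownlee 2003 48 Green
EOF

# ~ tests only the named field, so "Brownlee" in $1 does not count here.
brown=$("$AWK" '{ if ($4 ~ /Brown/) print $1 }' "$data")
not_brown=$("$AWK" '{ if ($4 !~ /Brown/) print $1 }' "$data")
score_48=$("$AWK" '{ if ($3 == "48") print $1 }' "$data")

printf 'brown: %s\n' "$brown"
rm -f "$data"
```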

It is possible to use variables instead of field numbers. This makes a gawk command easier to read and to change. For example,
# gawk '{name=$1;belts=$4;if(belts ~ /Yellow/) print name " is belt " belts}' test.txt
# gawk 'BEGIN {print "name,string"} {if($6>$5) print $1,$3} END {print "end of result"}' test.txt
gawk does not modify the original file; assigning to a field only changes gawk's own copy of the record.
# gawk '{if ($1=="J.Troll")($1="J.L.Troll");print $1}' test.txt
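
The following sketch shows a named variable in use and confirms that assigning to a field leaves the input file untouched. The two records and all variable names are assumptions made for the example; AWK falls back to plain awk when gawk is not installed.

```shell
AWK=$(command -v gawk || command -v awk)

# Hypothetical records: name, year, belt
data=$(mktemp)
printf 'J.Troll 2004 Yellow\nP.Bunny 2005 Green\n' > "$data"
before=$(cat "$data")

# Variables give the fields readable names.
yellow=$("$AWK" '{ name = $1; belt = $3; if (belt ~ /Yellow/) print name " has belt " belt }' "$data")

# Assigning to $1 changes only the record held in memory.
renamed=$("$AWK" '{ if ($1 == "J.Troll") $1 = "J.L.Troll"; print $1 }' "$data")
after=$(cat "$data")

printf '%s\n%s\n' "$yellow" "$renamed"
rm -f "$data"
```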

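Accumulating a total:
# gawk '/^[^d]/ {print $9,$5} {tot+=$5} END {print "total KB:" tot}' test.txt
Assuming test.txt holds output in "ls -l" format, this prints the name ($9) and size ($5) of every entry that is not a directory, adds up column 5 of all lines in tot, and prints the total at the end.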
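
A variant of the running total can be sketched on a fixed listing. The three "ls -l" style lines are invented for the example, and unlike the command above this version adds up only the non-directory entries, because the addition sits inside the same action as the print.

```shell
AWK=$(command -v gawk || command -v awk)

# Hypothetical "ls -l" style listing; field 5 is the size, field 9 the name.
listing='-rw-r--r-- 1 phil users 100 Feb 9 10:00 a.txt
drwxr-xr-x 2 phil users 4096 Feb 9 10:00 dir
-rw-r--r-- 1 phil users 250 Feb 9 10:00 b.txt'

# Lines starting with "d" (directories) are skipped by the pattern.
summary=$(printf '%s\n' "$listing" |
  "$AWK" '/^[^d]/ { print $9, $5; tot += $5 } END { print "total bytes:", tot }')

printf '%s\n' "$summary"
```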

To find the file systems whose free space is more than 56000 KB, the following command can be used:
# df -k | gawk '$4 ~ /^[0-9]/ {if($4>TRIGGER) print $6"\t"$4}' TRIGGER=56000
The columns of the "# df -k" output are numbered as follows:
Filesystem  1024-blocks  Used  Free  %Used  Mountpoint
--------------------------------------------------------
1           2            3     4     5      6
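
The threshold test can be tried on fixed text instead of live df output, which keeps the result predictable. The two file system lines are invented for this sketch, and the threshold is passed with -v, which is an alternative to writing the assignment after the program.

```shell
AWK=$(command -v gawk || command -v awk)

# Hypothetical output in the six-column layout shown above.
df_output='Filesystem 1024-blocks Used Free %Used Mountpoint
/dev/hd4 65536 20000 45536 31% /
/dev/hd2 1048576 900000 148576 86% /usr'

# The pattern skips the header line, whose 4th column is not a number.
roomy=$(printf '%s\n' "$df_output" |
  "$AWK" -v TRIGGER=56000 '$4 ~ /^[0-9]/ { if ($4 > TRIGGER) print $6 "\t" $4 }')

printf '%s\n' "$roomy"
```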

Category: Technology

grep

February 8, 2006 Phillip No comments

What is grep? grep prints the lines that match a pattern.
grep syntax:
grep [options] pattern [file]
In this syntax the pattern can be a regular expression or a plain string. A string should be enclosed in double quotes. The quotes keep the shell from interpreting part of the string as a shell command, and they allow a string containing a space, such as "get up", to be treated as a single pattern. When the pattern contains a variable, the double quotes must not be omitted either.

Some option parameters:
-c : print only the count of matching lines per file
-i : ignore case distinctions
-n : print the line number of each output line
-s : suppress error messages about nonexistent or unreadable files
-v : select the lines that do not match
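
These options can be compared side by side on a small file. The four lines of sample data and the variable names are assumptions made for this sketch.

```shell
# Hypothetical input: a name and a number on each line.
tmp=$(mktemp)
printf 'Mary 219\njoe 216\nPeter 234\nmary 219\n' > "$tmp"

count=$(grep -c 'Mary' "$tmp")          # number of lines containing "Mary"
count_i=$(grep -c -i 'mary' "$tmp")     # the same, ignoring case
numbered=$(grep -n '216' "$tmp")        # matching line with its line number
others=$(grep -v -i 'mary' "$tmp")      # lines that do not match

printf '%s %s\n%s\n%s\n' "$count" "$count_i" "$numbered" "$others"
rm -f "$tmp"
```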

"\>" is used for exact matching: it anchors the pattern at the end of a word, so '48\>' matches "48" but not "484". This is very important.

The -E option enables extended regular expressions, which support "OR" matching with "|". For example, to find lines containing 219 or 216, the following command is used: # grep -E '219|216' testfile
Note: without the single quotes, as in "grep -E 219|216 testfile", the shell treats "|" as a pipe and tries to run 216 as a command, so an error message is displayed (in Mandriva):
# grep -E 219|216 testfile
bash: 216: command not found
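
A quoted alternation can be checked on a few lines of sample data. The contents of testfile are not shown in these notes, so the lines below are invented for this sketch.

```shell
# Hypothetical input with three different numbers.
tmp=$(mktemp)
printf 'Mary 219\nJoe 216\nPeter 234\n' > "$tmp"

# The single quotes keep "|" away from the shell, so grep sees it as "or".
either=$(grep -E '219|216' "$tmp")

printf '%s\n' "$either"
rm -f "$tmp"
```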

Some useful grep commands:
# ls -l | grep '^d'
# grep 'username' /etc/passwd
# ps -ax | grep "named"
This command shows whether the DNS server (named) is running.
# STR="Mary Joe Peter Paulin"
# echo "$STR" | grep "Mary"
output: Mary Joe Peter Paulin
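
The role of the double quotes can be seen when the pattern contains a space or comes from a variable. In this sketch the NAME variable and the result variable names are assumptions added for illustration.

```shell
STR="Mary Joe Peter Paulin"   # no spaces around "=" in a shell assignment
NAME="Joe Peter"              # hypothetical pattern containing a space

# Quoting keeps "$STR" as one string and "$NAME" as one pattern.
whole=$(echo "$STR" | grep "Mary")
with_space=$(echo "$STR" | grep -c "$NAME")

printf '%s\n%s\n' "$whole" "$with_space"
```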

egrep: extended grep, equivalent to "grep -E". It is often used with the "-f" option, which reads the patterns from a file.

Category: Technology

Regular Expressions

February 8, 2006 Phillip No comments

A regular expression is a pattern made up of ordinary characters and special characters. The special characters are listed here:
^ : matches only at the beginning of a line
$ : matches only at the end of a line
* : matches zero or more occurrences of the preceding character
[] : matches any one character from a set or range. For example, [0-9] matches any digit from 0 to 9.
\ : removes the special meaning of the following character, e.g. "\$" means a literal "$" in an RE.
. : matches any single character
pattern\{n\} : matches exactly n occurrences of pattern
pattern\{n,\} : matches at least n occurrences of pattern
pattern\{n,m\} : matches between n and m occurrences of pattern
(With "grep -E" the braces are written without the backslashes.)
e.g.:
dd-mm-yyyy : [0-9]\{2\}-[0-9]\{2\}-[0-9]\{4\}
IP address in the form xxx.xxx.xxx.xxx :
[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}
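
The two example patterns can be tested with grep, written in basic regular expression syntax (escaped braces, escaped dots) and anchored with ^ and $ so that the whole line must match. This is a sketch: the helper function matches and the sample strings are made up for the example.

```shell
# Basic regular expressions as used by plain grep.
date_re='^[0-9]\{2\}-[0-9]\{2\}-[0-9]\{4\}$'
ip_re='^[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}$'
loose_re='^[0-9]\{3\}.[0-9]\{3\}.[0-9]\{3\}.[0-9]\{3\}$'   # dots left unescaped

# matches PATTERN STRING: succeeds when STRING matches PATTERN (hypothetical helper).
matches() { printf '%s\n' "$2" | grep -q "$1"; }

for s in 11-02-2006 2006-02-11; do
  if matches "$date_re" "$s"; then echo "$s: date"; else echo "$s: no match"; fi
done

# An unescaped "." matches any character, so the loose pattern accepts "x" separators.
for s in 192.168.001.010 192.168.1.10 192x168x001x010; do
  if matches "$ip_re" "$s"; then echo "$s: ip"; else echo "$s: no match"; fi
done
```

The IP pattern accepts only groups of exactly three digits, as in the xxx.xxx.xxx.xxx form above; an address such as 192.168.1.10 would need \{1,3\} in place of \{3\}.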

The difference between good code and wonderful code often depends on using regular expressions well.

Category: Technology

Schedule

February 4, 2006 Phillip No comments

First, I think I should read more books to improve my abilities. These include books on economics and computers (hardware and databases), as well as some cookbooks. What I gain from the books matters more than the number of books I read. I should waste less time playing games such as PES5 and FM2006, though they are very interesting. Reading makes me brighter and shows me the beauty of the world. Second, my spoken English needs more practice. I have asked my little sister, Apple, to help me. We plan to do some reading and listening practice in the morning; I think one hour is feasible. This plan needs our patience and persistence. In my spare time, I will also arrange some Thai boxing.

Category: Technology
