Shell Programming: Regular Expressions (Part 2) - 创新互联
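Text processors
Linux/UNIX systems ship with many text processors and text editors, including the Vim editor and grep covered earlier. Of these, grep, sed, and awk are the text-processing tools used most often in shell programming, and are known as the "three swordsmen" of shell programming.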
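The sed tool
sed (Stream EDitor) is a powerful yet simple tool for parsing and transforming text. It reads text, edits it according to the conditions given (deleting, replacing, adding, moving, and so on), and then outputs either every line or only the lines it processed. Because sed can carry out fairly complex text processing without any interaction, it is widely used in shell scripts to automate all kinds of tasks.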
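The sed workflow consists of three steps: read, execute, and display.
- Read: sed reads one line from the input stream (a file, a pipe, or standard input) and stores it in a temporary buffer, also called the pattern space.
- Execute: by default, all sed commands are executed in order in the pattern space. Unless a line address is given, the commands are applied to every line in turn.
- Display: the modified content is sent to the output stream. Once the data has been sent, the pattern space is cleared.
- These steps repeat until all of the input has been processed.
- Because sed commands run in the pattern space by default, the input file itself is not changed. To keep the result, redirect the output to a file (or use the -i option listed below).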
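The read, execute, display cycle can be sketched as follows. This is only an illustration, assuming GNU sed and a throwaway file created with mktemp (the file and its contents are hypothetical, not part of the original examples). It shows that sed writes the edited text to standard output while the file on disk keeps its original content, and that redirection stores the result.

```shell
# Hypothetical scratch file used only for this demonstration.
demo=$(mktemp)
printf 'alpha\nbeta\n' > "$demo"

# sed edits each line in the pattern space and writes it to standard output.
edited=$(sed 's/alpha/ALPHA/' "$demo")
echo "$edited"

# The file on disk still holds the original text.
original=$(cat "$demo")
echo "$original"

# Redirect the output to a second file to keep the edited copy.
sed 's/alpha/ALPHA/' "$demo" > "$demo.new"
saved=$(cat "$demo.new")

rm -f "$demo" "$demo.new"
```

Redirecting to a second file rather than back onto the input avoids truncating the input before sed has read it.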
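Common sed usage
The sed command has two forms:
sed [options] 'operation' file... //"file" is the target file to operate on; when there are several, separate them with spaces
sed [options] -f scriptfile file... //scriptfile is a script file, which must be given with the -f option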
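Common sed options
- -e or --expression=: process the input text with the given command or script
- -f or --file=: process the input text with the given script file
- -h or --help: show help
- -n, --quiet, or --silent: show only the processed result
- -i: edit the text file in place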
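- The "operation" specifies what to do to the file, that is, the sed command itself. It usually takes the form "[n1[,n2]] action". n1 and n2 are optional and select the lines to operate on; for example, to act on lines 5 through 20, write "5,20 action". Common actions include:
  - a: append, adds a line with the given content below the current line
  - c: change, replaces the selected lines with the given content
  - d: delete, deletes the selected lines
  - i: insert, adds a line with the given content above the selected line
  - p: print, prints the selected lines, or the whole input if no lines are selected; it is usually combined with the -n option (to display non-printing characters in escaped form, use the l command instead)
  - s: substitute, replaces the specified string
  - y: transliterate, converts characters one for one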
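The options and actions listed above can be tried on a small file. The following is a minimal sketch, assuming GNU sed (the one-line forms of a, i, and c, and the argument-free -i, are GNU behavior); the three-line file and its contents are made up for illustration.

```shell
# Hypothetical scratch file; the contents are made up for illustration.
f=$(mktemp)
printf 'one\ntwo\nthree\n' > "$f"

# -n suppresses automatic printing, so only the line selected by "2p" appears.
only_two=$(sed -n '2p' "$f")

# -e can be repeated to chain several operations in one pass.
chained=$(sed -e '1d' -e 's/three/THREE/' "$f")

# a appends after the addressed line, i inserts before it (GNU one-line form).
appended=$(sed '1a after-one' "$f")
inserted=$(sed '1i before-one' "$f")

# c replaces the whole addressed line; y maps characters one for one (t->T, o->O).
changed=$(sed '2c TWO' "$f")
mapped=$(sed 'y/to/TO/' "$f")

# -i writes the result back to the file instead of standard output.
sed -i '3d' "$f"
in_place=$(cat "$f")

printf '%s\n' "$only_two" "$chained" "$appended" "$inserted" "$changed" "$mapped" "$in_place"
rm -f "$f"
```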
Examples
1) Print lines that match a condition (p prints the line)
[root@localhost opt]# sed -n 'p' httpd.txt //print the whole file, same as cat httpd.txt
#
# This is the main Apache HTTP server configuration file. It contains the
# configuration directives that give the server its instructions.
# See for detailed information.
# In particular, see
#
# for a discussion of each configuration directive.
#
# Do NOT simply read the instructions in here without understanding
# what they do. They're here only as hints or reminders. If you are unsure
# consult the online docs. You have been warned.
#
# Configuration and logfile names: If the filenames you specify for many
# of the server's control files begin with "/" (or "drive:/" for Win32), the
# server will use that explicit path. If the filenames do *not* begin
# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
# with ServerRoot set to '/www' will be interpreted by the
# server as '/www/log/access_log', where as '/log/access_log' will be
...//output truncated....
[root@localhost opt]# sed -n '3p' httpd.txt //print line 3
# configuration directives that give the server its instructions.
[root@localhost opt]# sed -n '3,5p' httpd.txt //print lines 3 to 5
# configuration directives that give the server its instructions.
# See for detailed information.
# In particular, see
[root@localhost opt]# sed -n 'p;n' httpd.txt //print all odd-numbered lines; n reads the next line of input
#
# configuration directives that give the server its instructions.
# In particular, see
# for a discussion of each configuration directive.
# Do NOT simply read the instructions in here without understanding
# consult the online docs. You have been warned.
# Configuration and logfile names: If the filenames you specify for many
# server will use that explicit path. If the filenames do *not* begin
# with ServerRoot set to '/www' will be interpreted by the
# interpreted as '/log/access_log'.
...//output truncated....
[root@localhost opt]# sed -n 'n;p' httpd.txt //print all even-numbered lines
# This is the main Apache HTTP server configuration file. It contains the
# See for detailed information.
#
#
# what they do. They're here only as hints or reminders. If you are unsure
#
# of the server's control files begin with "/" (or "drive:/" for Win32), the
...//output truncated....
[root@localhost opt]# sed -n '1,5{p;n}' httpd.txt //print the odd-numbered lines between lines 1 and 5 (lines 1, 3, and 5)
#
# configuration directives that give the server its instructions.
# In particular, see
[root@localhost opt]# sed -n '350,${n;p}' httpd.txt //print the even-numbered lines from line 350 to the end of the file
#
IncludeOptional conf.d/*.conf
short
wod
woooood
AxyzxyzC
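//When this command runs, the first line read is line 350 and the second is line 351, and so on, so the "even" lines printed are lines 351, 353, and onward to the end of the file, blank lines included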
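The line-parity behavior is easy to check on numbered input. The following is a small sketch, assuming GNU sed and coreutils; seq 1 10 stands in for a ten-line file and is not part of the original examples.

```shell
# seq 1 10 stands in for a ten-line file, one number per line.
odd=$(seq 1 10 | sed -n 'p;n' | paste -sd' ' -)
even=$(seq 1 10 | sed -n 'n;p' | paste -sd' ' -)

# Inside a range, n;p skips the first line of the range and prints the one
# after it, so counting starts at the range, not at line 1 of the file.
from_four=$(seq 1 10 | sed -n '4,${n;p}' | paste -sd' ' -)

echo "odd:       $odd"
echo "even:      $even"
echo "from_four: $from_four"
```

With a range starting at line 4, the lines printed are 5, 7, and 9, which matches the note above about line 350.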
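When sed is combined with regular expressions, the format differs slightly: the regular expression is enclosed in "/".
[root@localhost opt]# sed -n '/the/p' httpd.txt //print the lines containing the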
# This is the main Apache HTTP server configuration file. It contains the
# configuration directives that give the server its instructions.
# Do NOT simply read the instructions in here without understanding
# what they do. They're here only as hints or reminders. If you are unsure
# consult the online docs. You have been warned.
# Configuration and logfile names: If the filenames you specify for many
# of the server's control files begin with "/" (or "drive:/" for Win32), the
# server will use that explicit path. If the filenames do *not* begin
# with "/", the value of ServerRoot is prepended -- so 'log/access_log'
# with ServerRoot set to '/www' will be interpreted by the
[root@localhost opt]# sed -n '4,/the/p' httpd.txt //print from line 4 through the first following line that contains the
# See for detailed information.
# In particular, see
#
# for a discussion of each configuration directive.
#
# Do NOT simply read the instructions in here without understanding
[root@localhost opt]# sed -n '/the/=' httpd.txt //print the line numbers of the lines containing the; the equals sign (=) prints the line number
2
3
9
10
11
13
14
...//output truncated...
[root@localhost opt]# sed -n '/^sh/p' httpd.txt //print the lines beginning with sh
shirt
short
[root@localhost opt]# sed -n '/[0-9]$/p' httpd.txt //print the lines ending in a digit
#Listen 12.34.56.78:80
Listen 80
#ServerName www.example.com:80
AddDefaultCharset UTF-8
[root@localhost opt]# sed -n '/\<wood\>/p' httpd.txt //print the lines containing the word wood; \< and \> mark word boundaries
wood
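The regular-expression addresses above can be reproduced on a short, self-made file. This is a sketch, assuming GNU sed (the \< and \> word anchors are a GNU extension); the six sample lines are hypothetical and only loosely echo the tail of httpd.txt.

```shell
# Hypothetical six-line sample; not the httpd.txt used in the transcript.
f=$(mktemp)
printf 'shirt\nshort\nwood\nwoodland\nListen 80\nthe end\n' > "$f"

starts_sh=$(sed -n '/^sh/p' "$f")         # lines beginning with sh
ends_digit=$(sed -n '/[0-9]$/p' "$f")     # lines ending in a digit
whole_word=$(sed -n '/\<wood\>/p' "$f")   # wood as a whole word, so not woodland
line_no=$(sed -n '/the/=' "$f")           # line number of each match
mixed=$(sed -n '4,/the/p' "$f")           # from line 4 to the next line containing the

printf '%s\n' "$starts_sh" "$ends_digit" "$whole_word" "$line_no" "$mixed"
rm -f "$f"
```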
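2) Delete lines that match a condition (d)
- In the commands below, nl numbers the lines of the file, which makes the result of each command easier to see.
[root@localhost opt]# nl httpd.txt | sed '3d' //delete line 3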
1 u
2 # This is the main Apache HTTP server configuration file. It contains the
4 # See for detailed information.
5 # In particular, see
6 #
7 # for a discussion of each configuration directive.
...//output truncated...
[root@localhost opt]# nl httpd.txt | sed '3,5d' //delete lines 3 to 5
1 u
2 # This is the main Apache HTTP server configuration file. It contains the
6 #
7 # for a discussion of each configuration directive.
...//output truncated...
[root@localhost opt]# nl httpd.txt | sed '/wood/d' //delete the lines containing wood; the original line 321 is removed
1 u //to delete the lines that do NOT match instead, negate the address with !, for example '/cross/!d'
2 # This is the main Apache HTTP server configuration file. It contains the
3 # configuration directives that give the server its instructions.
...//output truncated...
318 short
319 wd
320 wod
322 woooood
323 AxyzC
324 AxyzxyzC
[root@localhost opt]# sed '/^[a-z]/d' httpd.txt //delete the lines beginning with a lowercase letter
# This is the main Apache HTTP server configuration file. It contains the
# configuration directives that give the server its instructions.
...//output truncated...
# Load config files in the "/etc/httpd/conf.d" directory, if any.
IncludeOptional conf.d/*.conf
AxyzC
AxyzxyzC
[root@localhost opt]# sed '/\.$/d' httpd.txt //delete the lines ending with "."
u
# This is the main Apache HTTP server configuration file. It contains the
# In particular, see
#
...//output truncated...
[root@localhost opt]# sed '/^$/d' httpd.txt //delete all blank lines
u
# This is the main Apache HTTP server configuration file. It contains the
# configuration directives that give the server its instructions.
# See for detailed information.
# In particular, see
#
# for a discussion of each configuration directive.
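//Note: to collapse repeated blank lines so that a pair of consecutive blank lines keeps only one, run sed -e '/^$/{n;/^$/d}' httpd.txt. For pairs of blank lines the effect matches cat -s httpd.txt, but a run of three or more blank lines can still leave more than one. n reads the next line of input.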
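The difference between the n-based command and cat -s shows up on a run of three blank lines. The sketch below assumes GNU sed and coreutils. The N/D variant is a commonly used alternative and is not taken from the original article; the sample file is hypothetical.

```shell
# Hypothetical sample: A, three blank lines, B.
f=$(mktemp)
printf 'A\n\n\n\nB\n' > "$f"

# Form from the note above: removes only every second blank line of a run.
n_form=$(sed -e '/^$/{n;/^$/d}' "$f" | grep -c '^$')

# Alternative (not from the article): N joins the next line, D drops the
# first of two blank lines and restarts the cycle without reading new input.
nd_form=$(sed '/^$/N;/\n$/D' "$f" | grep -c '^$')

# Reference behavior.
cat_s=$(cat -s "$f" | grep -c '^$')

echo "blank lines left: n form=$n_form, N/D form=$nd_form, cat -s=$cat_s"
rm -f "$f"
```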
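3) Replace text that matches a condition
- Replacement with sed relies on the s (string substitution), c (whole-line or whole-block replacement), and y (character transliteration) commands.
[root@localhost opt]# sed 's/the/THE/' httpd.txt //replace the first the on each line with THE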
u
# This is THE main Apache HTTP server configuration file. It contains the
# configuration directives that give THE server its instructions.
# See for detailed information.
# In particular, see
...//output truncated...
[root@localhost opt]# vim aaa.txt //edit a new file
llllll
llllll
llllll //file content
llllll
llllll
llllll
~
~
:wq //save and quit
[root@localhost opt]# sed 's/l/L/3' aaa.txt //replace the third l on each line with L
llLlll
llLlll //content after the substitution
llLlll
llLlll
llLlll
llLlll
[root@localhost opt]# sed 's/the/THE/g' httpd.txt //replace every the in the file with THE
u
# This is THE main Apache HTTP server configuration file. It contains THE
# configuration directives that give THE server its instructions.
# See for detailed information.
# In particular, see
...//output truncated...
[root@localhost opt]# sed 's/o//g' httpd.txt //delete every o in the file (replace it with an empty string)
u
# This is the main Apache HTTP server cnfiguratin file. It cntains the
# cnfiguratin directives that give the server its instructins.
...//output truncated...
shirt
shrt
wd
wd
wd
wd
AxyzC
AxyzxyzC
[root@localhost opt]# sed 's/^/#/' aaa.txt //insert # at the start of every line
#llllll
#llllll
#llllll
#llllll
#llllll
#llllll
[root@localhost opt]# sed 's/$/eof/' aaa.txt //append the string eof to the end of every line
lllllleof
lllllleof
lllllleof
lllllleof
lllllleof
lllllleof
[root@localhost opt]# vim aaa.txt //edit the file
llllll
llllll
llllll
llllll
llllll
llllll
aaaaaa
aaaaaa //added content
aaaaaa
aaaaaa
aaaaaa
~
~
:wq //save and quit
[root@localhost opt]# sed '/a/s/^/#/' aaa.txt //insert # at the start of every line that contains a
llllll
llllll
llllll
llllll
llllll
llllll
#aaaaaa
#aaaaaa
#aaaaaa
#aaaaaa
#aaaaaa
[root@localhost opt]# sed '3,5s/l/L/g' aaa.txt //replace every l in lines 3 to 5 with L
llllll
llllll
LLLLLL
LLLLLL
LLLLLL
llllll
aaaaaa
aaaaaa
aaaaaa
aaaaaa
aaaaaa
[root@localhost opt]# vim bbb.txt //edit a new file
this is
the wood
wood
wod //file content
the wood
this is test
~
~
:wq //save and quit
[root@localhost opt]# sed '/the/s/o/O/g' bbb.txt //in every line that contains the, replace each o with O
this is
the wOOd
wood
wod
the wOOd
this is test
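The substitution variants above differ only in the flag after the last slash and in the optional address in front of s. The following sketch, assuming GNU sed, puts them side by side; the scratch file repeats the content of bbb.txt shown above, and the echo inputs are made up for illustration.

```shell
# Scratch copy with the same six lines as bbb.txt in the transcript.
f=$(mktemp)
printf 'this is\nthe wood\nwood\nwod\nthe wood\nthis is test\n' > "$f"

# A numeric flag replaces only that occurrence on each line.
third=$(echo 'llllll' | sed 's/l/L/3')

# Without a flag only the first match changes; g changes every match,
# including matches inside longer words such as "other".
first_only=$(echo 'the other the' | sed 's/the/THE/')
every=$(echo 'the other the' | sed 's/the/THE/g')

# An address in front of s limits the substitution to the matching lines.
addressed=$(sed '/the/s/o/O/g' "$f")

printf '%s\n' "$third" "$first_only" "$every" "$addressed"
rm -f "$f"
```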
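4) Move lines that match a condition
H copies the line to the hold space (sed's "clipboard"); g and G overwrite or append the hold-space data at the specified line; w saves to a file; r reads the specified file; a appends the specified content.
[root@localhost opt]# sed '/the/{H;d};$G' bbb.txt //move the lines containing the to the end of the file; {;} groups multiple operations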
this is
wood
wod
this is test
the wood
the wood
[root@localhost opt]# sed '1,3{H;d};8G' aaa.txt //move lines 1 to 3 to after line 8
llllll
llllll
llllll
aaaaaa
aaaaaa
llllll
llllll
llllll
aaaaaa
aaaaaa
aaaaaa
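The H and G commands can be traced on a copy of bbb.txt. This is a sketch assuming GNU sed. Because H always adds a newline before the line it stores, the hold space starts with an empty line, so the real output contains a blank line in front of the moved block. The second command, which removes that blank line, is an illustrative variant and not from the original article.

```shell
# Scratch copy with the same six lines as bbb.txt in the transcript.
f=$(mktemp)
printf 'this is\nthe wood\nwood\nwod\nthe wood\nthis is test\n' > "$f"

# H appends "newline + line" to the hold space, d drops the line from the
# output, and $G appends the hold space to the last line.
moved=$(sed '/the/{H;d};$G' "$f")

# Variant (assumption, GNU sed): squeeze the doubled newline left by H and G.
tight=$(sed '/the/{H;d};${G;s/\n\n/\n/}' "$f")

printf '%s\n---\n%s\n' "$moved" "$tight"
rm -f "$f"
```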
[root@localhost opt]# sed '/the/w ccc.txt' bbb.txt //save the lines containing the to the file ccc.txt
this is
the wood
wood
wod
the wood
this is test
[root@localhost opt]# cat ccc.txt //view the saved file
the wood
the wood
[root@localhost opt]# sed '/the/r /etc/hostname' bbb.txt //append the contents of /etc/hostname after every line that contains the
this is
the wood
localhost.localdomain
wood
wod
the wood
localhost.localdomain
this is test
[root@localhost opt]# sed '3aNEW' bbb.txt //add a new line with the content NEW after line 3
this is
the wood
wood
NEW
wod
the wood
this is test
[root@localhost opt]# sed '/the/aNEW' bbb.txt //add a new line with the content NEW after every line that contains the
this is
the wood
NEW
wood
wod
the wood
NEW
this is test
[root@localhost opt]# sed '3aNEW\nNEW2' bbb.txt //add several lines after line 3; the \n in the middle marks a line break
this is
the wood
wood
NEW
NEW2
wod
the wood
this is test
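The w, r, and a commands can be combined in one self-contained run. The following is a sketch assuming GNU sed; the temporary directory and the file names in.txt, out.txt, and extra.txt are hypothetical stand-ins for bbb.txt, ccc.txt, and /etc/hostname.

```shell
# Hypothetical scratch directory and files.
dir=$(mktemp -d)
printf 'this is\nthe wood\nwood\n' > "$dir/in.txt"
printf 'EXTRA\n' > "$dir/extra.txt"

# w writes the matching lines to another file; -n keeps the terminal quiet.
sed -n "/the/w $dir/out.txt" "$dir/in.txt"
written=$(cat "$dir/out.txt")

# r queues the contents of a file to be output after each matching line.
read_in=$(sed "/the/r $dir/extra.txt" "$dir/in.txt")

# a appends literal text after each matching line (GNU one-line form).
appended=$(sed '/the/aNEW' "$dir/in.txt")

printf '%s\n---\n%s\n---\n%s\n' "$written" "$read_in" "$appended"
rm -rf "$dir"
```

The file name given to w or r runs to the end of the script line, so it is simplest to keep these commands last in an expression.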
5) Edit files with a script
- A sed script stores several editing commands in a file (one command per line) and is invoked with the -f option.
sed '1,3{H;d};6G' bbb.txt //move lines 1 to 3 to after line 6
[root@localhost opt]# vim test
1,3H
1,3d
6G
~
:wq
[root@localhost opt]# sed -f test bbb.txt
wod
the wood
this is test
this is
the wood
wood
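The script file and the inline command give the same result, which can be checked without an editor. This is a sketch assuming GNU sed; the script name move.sed and the scratch directory are hypothetical. As with any H followed by G, the real output contains an empty line in front of the moved block.

```shell
# Hypothetical scratch directory; bbb.txt repeats the content used above.
dir=$(mktemp -d)
printf 'this is\nthe wood\nwood\nwod\nthe wood\nthis is test\n' > "$dir/bbb.txt"

# One editing command per line, as in the script file from the transcript.
printf '1,3H\n1,3d\n6G\n' > "$dir/move.sed"

via_script=$(sed -f "$dir/move.sed" "$dir/bbb.txt")
inline=$(sed '1,3{H;d};6G' "$dir/bbb.txt")

echo "$via_script"
rm -rf "$dir"
```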
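6) Example: using sed to edit a file directly
- Write a script that adjusts the vsftpd service configuration: anonymous users are denied, while local users are allowed (including write access).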
[root@localhost ~]# vim local_only_ftp.sh
#!/bin/bash
# Paths of the sample file and of the configuration file
SAMPLE="/usr/share/doc/vsftpd-3.0.2/EXAMPLE/INTERNET_SITE/vsftpd.conf"
CONFIG="/etc/vsftpd/vsftpd.conf"
# Back up the original configuration: if /etc/vsftpd/vsftpd.conf.bak does not exist yet, create it with cp
[ ! -e "$CONFIG.bak" ] && cp "$CONFIG" "$CONFIG.bak"
# Adjust the sample configuration and overwrite the existing file with the result
sed -e '/^anonymous_enable/s/YES/NO/g' "$SAMPLE" > "$CONFIG"
sed -i -e '/^local_enable/s/NO/YES/g' -e '/^write_enable/s/NO/YES/g' "$CONFIG"
# Append listen=YES only if no listen line is present
grep "listen" "$CONFIG" || sed -i '$alisten=YES' "$CONFIG"
# Start the vsftpd service and enable it at boot
systemctl restart vsftpd
systemctl enable vsftpd
[root@localhost ~]# chmod +x local_only_ftp.sh
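The same editing pattern can be rehearsed on a scratch file before touching a live configuration. This is a sketch only, assuming GNU sed; the temporary file and its three settings are hypothetical stand-ins for vsftpd.conf, and no service is restarted.

```shell
# Hypothetical stand-in for the real configuration file.
conf=$(mktemp)
printf 'anonymous_enable=YES\nlocal_enable=NO\nwrite_enable=NO\n' > "$conf"

# Keep a backup only if one does not exist yet.
[ ! -e "$conf.bak" ] && cp "$conf" "$conf.bak"

# Flip the three settings in place; each address limits its substitution.
sed -i -e '/^anonymous_enable/s/YES/NO/' \
       -e '/^local_enable/s/NO/YES/' \
       -e '/^write_enable/s/NO/YES/' "$conf"

# Append listen=YES only when missing; the second run changes nothing.
grep -q '^listen=' "$conf" || sed -i '$alisten=YES' "$conf"
grep -q '^listen=' "$conf" || sed -i '$alisten=YES' "$conf"

result=$(cat "$conf")
backup=$(cat "$conf.bak")
echo "$result"
rm -f "$conf" "$conf.bak"
```

Using grep -q with an anchored pattern keeps the check silent and avoids matching commented-out lines such as #listen=NO.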
Article URL: http://scpingwu.com/article/dpchdj.html