
Using Python to read data line by line from a formatted txt file and write it to Excel according to set rules (open...

Updated: 2023-08-12 05:37:19


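A few days ago I was given a task: pull some commit-related data from Gerrit into a text file with an ssh command, then store that data in Excel. The data format is shown in the figure below.

As the figure shows, the data in the text file follows a fixed format, so reading it with Python, processing it with regular expressions, and writing it to the Excel file greatly reduces the amount of manual work.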

1. Fetch the raw information from Gerrit and save it to a text file:

$ ssh -p 29418 <your-account>@192.168.1.16 gerrit query status:merged since:<date/7/days/ago> 2>&1 | tee merged_patch_
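If the fetch step is to be driven from Python rather than from the shell, the `2>&1 | tee` part of the command above could be sketched roughly as follows. The helper name `run_and_tee`, the output file name, and the stand-in command are assumptions made for illustration; in real use the command list would hold the ssh invocation shown above.

```python
import os
import subprocess
import sys
import tempfile


def run_and_tee(cmd, out_path):
    """Run cmd, merge stderr into stdout, echo the output and save it to out_path."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # same effect as the shell's 2>&1
        universal_newlines=True,    # decode the output to str
    )
    sys.stdout.write(result.stdout)          # what tee shows on the terminal
    with open(out_path, "w") as out_file:    # what tee writes to the file
        out_file.write(result.stdout)
    return result.stdout


# Stand-in command so the sketch runs anywhere (hypothetical, not the Gerrit query).
demo_cmd = [sys.executable, "-c",
            "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')"]
demo_path = os.path.join(tempfile.mkdtemp(), "query_output.txt")
captured = run_and_tee(demo_cmd, demo_path)
```

Unlike a real tee, this sketch waits for the command to finish before echoing anything, which is usually acceptable for a short query.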

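2. Read the data from the txt file.

Python's built-in file objects provide three "read" methods: .read(), .readline(), and .readlines(). Each accepts an optional argument that limits how much data is read per call, but they are usually called without one. .read() reads the whole file at once and is typically used to put the file's contents into a single string. That is the most direct string representation of the file, but it is unnecessary for sequential, line-oriented processing, and it is not possible at all if the file is larger than available memory.

.readline() and .readlines() differ in that the latter reads the whole file at once, like .read(). .readlines() splits the contents into a list of lines, which can be processed with Python's for ... in ... construct. .readline(), by contrast, reads only one line per call and is usually much slower than .readlines(); it should be used only when there is not enough memory to read the whole file in one go. Iterating over the file object itself (for line in f) also yields one line at a time and is the usual idiom for line-by-line processing.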
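The difference between the read methods is easy to see on a tiny file. The following is a minimal sketch; the sample file and its three lines are made up for the demonstration.

```python
import os
import tempfile

# Hypothetical three-line sample file.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "w") as f:
    f.write("change 1\nproject: demo\nbranch: master\n")

with open(sample_path) as f:
    whole_text = f.read()            # one string holding the entire file

with open(sample_path) as f:
    all_lines = f.readlines()        # list of lines, newline characters kept

with open(sample_path) as f:
    first_line = f.readline()        # a single line per call
    second_line = f.readline()

with open(sample_path) as f:
    iterated = [line for line in f]  # the file object yields one line at a time
```

Here `iterated` ends up equal to `all_lines`, but the loop form never needs the whole file in memory when the lines are processed one by one instead of being collected.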

patch_file_name = "merged_patch_"
patch_file = open(patch_file_name, 'r')  # open the file and read it line by line
for line in patch_file:
    print(line)

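3. Write to the Excel file

Among the Python libraries for handling Excel, xlrd, xlwt, and xlutils are the most commonly used, and there is plenty of material about them online. However, since xlwt and xlutils cannot write the .xlsx format used from Excel 2007 onward, I reluctantly had to give them up.

After some searching I found the openpyxl library, which supports Excel 2007 and is actively maintained (the latest version at the time of writing was 2.2.1, released on March 31, 2015). The official documentation describes it as a Python library for reading and writing Excel xlsx/xlsm files.

Installation (Windows 7): first install the jdcal module: unpack it to a directory, cd into that directory, and run "python setup.py install". Then install openpyxl the same way.

The steps for writing are as follows: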

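1. Open the workbook:

wb = load_workbook('Android_Patch_Review-Y2015.xlsx')

2. Get the worksheet

sheetnames = wb.get_sheet_names()  # newer openpyxl releases: wb.sheetnames
ws = wb.get_sheet_by_name(sheetnames[2])  # newer openpyxl releases: wb[sheetnames[2]]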

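3. Write the data from the txt file and set the cell format

patch_file_name = "merged_patch_"
patch_file = open(patch_file_name, 'r')  # open the file and read it line by line
ft = Font(name='Neo Sans Intel', size=11)
for line in patch_file:
    line = line.strip()  # then match and write each field, as in the complete code below

4. Save the workbook

wb.save('Android_Patch_Review-Y2015.xlsx')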

The complete code is as follows:

from openpyxl.workbook import Workbook
from openpyxl.reader.excel import load_workbook
from openpyxl.styles import PatternFill, Border, Side, Alignment, Protection, Font
import re
#from openpyxl.writer.excel import ExcelWriter
#import xlrd

ft = Font(name='Neo Sans Intel', size=11)  # define font style
bd = Border(left=Side(border_style='thin', color='00000000'),
            right=Side(border_style='thin', color='00000000'),
            top=Side(border_style='thin', color='00000000'),
            bottom=Side(border_style='thin', color='00000000'))  # define border style
alg_cc = Alignment(horizontal='center', vertical='center', text_rotation=0,
                   wrap_text=True, shrink_to_fit=True, indent=0)  # define alignment styles
alg_cb = Alignment(horizontal='center', vertical='bottom', text_rotation=0,
                   wrap_text=True, shrink_to_fit=True, indent=0)
alg_lc = Alignment(horizontal='left', vertical='center', text_rotation=0,
                   wrap_text=True, shrink_to_fit=True, indent=0)

patch_file_name = "merged_patch_"
patch_file = open(patch_file_name, 'r')  # get data patch text
wb = load_workbook('Android_Patch_Review-Y2015.xlsx')  # open excel to write
sheetnames = wb.get_sheet_names()
ws = wb.get_sheet_by_name(sheetnames[2])  # get sheet
rows = ws.max_row  # index of the last used row
assert ws.cell(row=rows, column=1).value is not None, 'New Document or empty row at the end of the document? Please input at least one row!'
print("The original Excel document has %d rows totally." % rows)

end_tag = 'type: stats'
for line in patch_file:
    if re.match(end_tag, line) is not None:  # end string
        break
    if len(line) == 1:  # blank line: go to next patch
        rows = rows + 1
        continue
    line = line.strip()
    # print(line)
    # The statements that write each matched field into its cell did not
    # survive in the source; `pass` marks where they belong.
    if re.match('change', line) is not None:
        pass
    if re.match('url:', line) is not None:
        pass
    if re.match('project:', line) is not None:
        pass
    if re.match('branch:', line) is not None:
        pass
    if re.match('lastUpdated:', line) is not None:
        pass
    if re.match('commitMessage:', line) is not None:
        description_str = re.sub('commitMessage:', '', line)
    if re.match('Product:|BugID:|Description:|Unit Test:|Change-Id:', line) is not None:
        description_str = description_str + '\n' + line
    if re.match('Signed-off-by:', line) is not None:
        description_str = description_str + '\n' + line

wb.save('Android_Patch_Review-Y2015.xlsx')
print('Android_Patch_Review-Y2015.xlsx saved!\nPatch Collection Done!')
#patch_file.close()
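The matching logic in the loop can also be tried out without Excel by collecting each patch into a dictionary first. The following is only a sketch under assumptions: the sample text is invented to resemble the fields matched above, and `parse_records`, `FIELD_RE`, and `TRAILER_RE` are hypothetical names, not part of the original script.

```python
import re

# Fields the script matches at the start of a stripped line.
FIELD_RE = re.compile(
    r'^(change|url:|project:|branch:|lastUpdated:|commitMessage:)\s*(.*)$')
# Commit-message lines that the script appends to the description.
TRAILER_RE = re.compile(
    r'Product:|BugID:|Description:|Unit Test:|Change-Id:|Signed-off-by:')
END_TAG = 'type: stats'


def parse_records(lines):
    """Group the matched fields of each patch into one dict per patch."""
    records, current = [], {}
    for raw in lines:
        if raw.startswith(END_TAG):      # statistics footer: stop reading
            break
        if raw.strip('\r\n') == '':      # empty line separates two patches
            if current:
                records.append(current)
                current = {}
            continue
        line = raw.strip()
        field = FIELD_RE.match(line)
        if field:
            current[field.group(1).rstrip(':')] = field.group(2)
        elif TRAILER_RE.match(line):
            current['commitMessage'] = current.get('commitMessage', '') + '\n' + line
    if current:
        records.append(current)
    return records


# Invented sample, shaped like the fields the script looks for.
SAMPLE = """change I0123abc
  project: platform/demo
  branch: master
  url: http://192.168.1.16/1001
  commitMessage: Fix crash on boot
                 Change-Id: I0123abc
                 Signed-off-by: A Developer <dev@example.com>
  lastUpdated: 2015-04-01 10:00:00 CST

change I0456def
  project: platform/other
  branch: dev
  url: http://192.168.1.16/1002
  commitMessage: Add logging
  lastUpdated: 2015-04-02 11:00:00 CST

type: stats
rowCount: 2
"""

records = parse_records(SAMPLE.splitlines(keepends=True))
```

Keeping the parsing separate from the cell writing makes the regular expressions testable on their own; each dictionary can then be written to one row of the sheet.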

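So far the basic functionality works, but there are two things I have not figured out:

The first is the commented-out last line of the complete code. In the few blog posts introducing openpyxl that I found, none of them call close after opening a file, so I did not call close in my code either. In theory it still feels necessary. I will come back to this once I understand file objects better.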
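For the text file at least, one common way to sidestep the question is a with block, which closes the file automatically when the block ends. A minimal sketch, using a made-up two-line file:

```python
import os
import tempfile

# Hypothetical two-line sample file.
sample_path = os.path.join(tempfile.mkdtemp(), "patches.txt")
with open(sample_path, "w") as f:
    f.write("line 1\nline 2\n")

with open(sample_path) as patch_file:
    line_count = sum(1 for _ in patch_file)  # read the file line by line
    closed_inside = patch_file.closed        # still open inside the block

closed_after = patch_file.closed             # closed once the block exits
```

With this form no explicit close() call is needed, and the file is closed even if an exception is raised inside the block.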

The second is a warning that appears when the script runs: "UserWarning: Discarded range with reserved name, warnings.warn("Discarded range with reserved name")". I am still looking for the cause; if anyone knows, please tell me.
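This does not explain the cause, but if the warning turns out to be harmless for the workbook in question, the standard warnings module can filter out just this one message. In the sketch below, `noisy_load` is a hypothetical stand-in that only emits the same message; it is not openpyxl code.

```python
import warnings


def noisy_load():
    """Hypothetical stand-in for load_workbook(); it only emits the same message."""
    warnings.warn("Discarded range with reserved name")
    return "workbook"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Ignore only this message; other warnings are still recorded.
    warnings.filterwarnings("ignore", message="Discarded range with reserved name")
    result = noisy_load()
    warnings.warn("some other warning")

messages = [str(w.message) for w in caught]
```

In a real script only the filterwarnings call would be needed, placed before the workbook is loaded.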
