Python files and IO

1. Files and IO

  • Data held in variables, sequences and objects is temporary and is lost when the program ends. To keep it for the long term, the program has to save it to a file on disk.
  • Python provides a built-in file object, plus built-in modules for working with files and directories. With these, data can easily be saved to files for long-term storage.

2. Basic file operations

  • In Python, the built-in open() function creates an open file object, and the basic file operations are then carried out through the methods that object provides.

2.1 creating and opening files

  • To work with a file in Python, you first create or open it and obtain a file object. This is done with the built-in open() function.
open() syntax:
file = open(filename[,mode[,buffering]])

Parameter Description:
 file: the file object that is created
 filename: the name of the file to create or open, written as a string in single or double quotation marks. If the file is in the current working directory, the file name alone is enough. Otherwise, write the full path
 mode: optional parameter that specifies how the file is opened. Common values are r (read-only, the default), w (write, discarding any existing content), a (append), and the read-write variants r+, w+ and a+. Adding b to any of them opens the file in binary mode
 buffering: optional parameter that specifies the buffering policy for reading and writing. 0 turns buffering off (allowed only in binary mode), 1 selects line buffering (text mode only), and a value greater than 1 sets the buffer size in bytes. If it is omitted, the default buffering policy is used

# Create a file. Opening a file that does not exist normally raises FileNotFoundError,
# but with mode w, w+, a or a+ a missing file is created automatically
file = open('yxy.txt','w')

# Open the file for reading. This would fail if the file had not been created above
file1 = open('yxy.txt','r')
print(file1)
'''
Output: <_io.TextIOWrapper name='yxy.txt' mode='r' encoding='cp936'>
cp936 is Windows code page 936, which is essentially GBK. GBK sits at number 936 in the code page numbering, hence the name CP936
'''

# Open a file with a specific encoding. When no encoding is given, open() uses the platform default
# (GBK on Chinese-language Windows); reading a file saved in another encoding may then throw an exception
# Method 1: change the encoding of the file itself
# Method 2 (recommended): pass the encoding to open()
file3 = open('yy.txt','w',encoding='utf-8')
print(file3)
'''
Output: <_io.TextIOWrapper name='yy.txt' mode='w' encoding='utf-8'>
'''
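
The mode behaviour described above can be checked with a small sketch. It works in a temporary directory, and the file name demo.txt is made up for illustration; only standard library calls are used.

```python
import os
import tempfile

# Work in a throwaway directory so that no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')   # hypothetical file name

    # 'r' mode requires the file to exist.
    try:
        open(path, 'r')
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True

    # 'w' mode creates the file when it is missing.
    f = open(path, 'w', encoding='utf-8')
    f.close()
    created = os.path.exists(path)

    # Now that the file exists, 'r' mode succeeds.
    f = open(path, 'r', encoding='utf-8')
    mode_seen = f.mode
    encoding_seen = f.encoding
    f.close()

print(missing_raises, created, mode_seen, encoding_seen)
```

Passing encoding explicitly, as here, keeps the result independent of the platform default.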

2.2 closing files

  • After opening a file, close it promptly to avoid damaging it unnecessarily. A file is closed with the close() method.
  • close() first flushes any buffered data that has not yet been written, so that it reaches the file, and then closes the file. Once the file is closed, it can no longer be written to.
# file.closed reports whether the file is closed: False while it is open, True after it is closed
file4 = open('message.txt','w',encoding='utf-8')            # Create open file
print('Before closing',file4.closed)                                # Output: False
file4.close()                                               # Close file
print('After closing',file4.closed)                                 # Output: True

# An OSError (IOError) may be raised while reading or writing a file. If that happens, a file4.close() placed after the failing code is never reached
# To make sure the file is closed whether or not an error occurs, use try...finally
# open() is called before try so that finally only runs for a file that was actually opened
file4 = open('message.txt', 'r', encoding='utf-8')
try:
    print(file4)
    '''
    <_io.TextIOWrapper name='message.txt' mode='r' encoding='utf-8'>
    '''
    print('Before closing', file4.closed)                          # Output: False
finally:
    file4.close()
    print('After closing', file4.closed)                          # Output: True

2.3 using with statement to open a file

  • If you open a file and forget to close it, unexpected problems can follow. Python's with statement handles this: whether or not an exception is thrown, the opened file is guaranteed to be closed once the with statement finishes.
with basic syntax:
with expression as target:
      with-body

Parameter Description:
 expression: the expression to evaluate, here a call to the open() function
 target: the variable in which the result of the expression is stored
 with-body: the body of the with statement. If there is nothing to do yet, a pass statement can stand in for it
with open('yxy.txt','r') as wfile:
    print(wfile.closed)                      # Output: False
print(wfile.closed)                          # Output: True
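
To illustrate the claim that the file is closed even when an exception is thrown, here is a minimal sketch. The ValueError is raised on purpose to simulate a failure, and demo.txt in a temporary directory is a hypothetical file used only for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')   # hypothetical file name
    with open(path, 'w', encoding='utf-8') as f:
        f.write('hello')

    try:
        with open(path, 'r', encoding='utf-8') as rfile:
            # Simulate an error in the middle of file processing.
            raise ValueError('simulated failure inside the with block')
    except ValueError:
        pass

    # The with statement closed the file before the exception propagated.
    closed_after_error = rfile.closed

print(closed_after_error)   # True
```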

2.4 reading files

  • Once a file is open, Python can read its contents as well as write to it.
read() syntax:
file.read([size])

Description: file is an open file object. The optional size parameter gives the number of characters to read. If it is omitted, the whole content is read in one go.

Note: to read with read(), the file must be opened in a readable mode such as r (read-only) or r+ (read-write). Otherwise, an exception is thrown:
io.UnsupportedOperation: not readable
# Content of yxy.txt file
'''
11111111
22222222
33333333
44444444
'''
with open('yxy.txt','r') as wfile:
    string = wfile.read()
print(string)
'''
Output:
11111111
22222222
33333333
44444444
'''

# Read one line
with open('yxy.txt','r') as wfile:
    line = wfile.readline()         # avoid naming the variable str, which would shadow the built-in type
    print(line)
'''
Output: 11111111
'''

# Read all lines
with open('yxy.txt','r') as wfile:
    str1 = wfile.readlines()
    print(str1)
'''
Output: ['11111111\n', '22222222\n', '33333333\n', '44444444']
'''

# readlines() returns a list of strings, one per line. For a large file, printing the whole list at once is slow,
# so the contents of the list can be printed line by line instead
with open('yxy.txt','r') as wfile:
    str1 = wfile.readlines()
    for message in str1:
        print(message.rstrip())     # print() adds its own line break, so rstrip() removes the one read from the file
'''
Output:
11111111
22222222
33333333
44444444
'''
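
For large files, an alternative to readlines() is to iterate over the file object itself, which reads one line at a time instead of building the whole list first. The sketch below assumes a temporary copy of yxy.txt with the same four lines as above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'yxy.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('11111111\n22222222\n33333333\n44444444')

    lines = []
    with open(path, 'r', encoding='utf-8') as rfile:
        for line in rfile:                    # the file object yields one line per iteration
            lines.append(line.rstrip('\n'))   # drop the trailing line break

print(lines)
```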

2.5 writing files

  • Call the write() method to write content to a file. The file must have been opened in w (write) or a (append) mode. Otherwise, io.UnsupportedOperation: not writable is thrown.
  • Opening a file in w mode overwrites its previous content, so use it with caution. To keep the existing content, open the file in a (append) mode, which writes at the end of the file.
with open('3.txt','a',encoding='utf-8') as fp:
    # write() writes a single string
    # fp.write("333")
    # writelines() writes the elements of a list (they must be strings) to the file
    data = ['111','222','333','444']
    # writelines() does not add line breaks, so append them here
    data = [line+'\n' for line in data]
    fp.writelines(data)
    # Flush the buffer so that pending data is written to the file right away
    # close() also flushes the buffer when it closes the file
    fp.flush()
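
The difference between w and a mode can be seen in a short sketch. It writes to a temporary file named 3.txt (the directory and the sample strings are assumptions for illustration) and reads the content back after each step.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, '3.txt')

    # 'w' creates the file, 'a' then adds to the end of it.
    with open(path, 'w', encoding='utf-8') as fp:
        fp.write('first\n')
    with open(path, 'a', encoding='utf-8') as fp:
        fp.write('second\n')
    with open(path, 'r', encoding='utf-8') as fp:
        after_append = fp.read()

    # Opening with 'w' again discards everything written so far.
    with open(path, 'w', encoding='utf-8') as fp:
        fp.write('third\n')
    with open(path, 'r', encoding='utf-8') as fp:
        after_overwrite = fp.read()

print(repr(after_append))      # 'first\nsecond\n'
print(repr(after_overwrite))   # 'third\n'
```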

2.6 move file pointer

file.seek(offset[,whence])

Note: seek() counts offset in bytes, not characters. In GBK a Chinese character takes two bytes, while an English letter or a digit takes one

Parameter Description:
 file: an open file object
 offset: the number of bytes to move the pointer, measured from the position given by whence
 whence: optional parameter giving the reference position for the offset. The possible values are:
   SEEK_SET or 0: the beginning of the file (the default)
   SEEK_CUR or 1: the current position (a nonzero offset works only if the file is opened in binary mode)
   SEEK_END or 2: the end of the file (a nonzero offset works only if the file is opened in binary mode)
# 1.txt content: hello world
with open('1.txt','r',encoding='utf-8') as fp:
    # Move to the space after hello
    fp.seek(5)
    print(fp.read(3))    # Output:  wo (a leading space, then wo)
    # Move back to the start
    fp.seek(0)
    print(fp.read(5))    # Output: hello
    # Show the current pointer position
    print(fp.tell())     # Output: 5
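
Seeking relative to the current position or the end of the file needs binary mode. The following sketch, which assumes a temporary 1.txt containing hello world, shows both, and also checks how many bytes one Chinese character occupies in UTF-8 and in GBK (the character 中 is just a sample).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, '1.txt')
    with open(path, 'w', encoding='utf-8') as fp:
        fp.write('hello world')

    with open(path, 'rb') as fp:            # binary mode: offsets are plain byte counts
        fp.seek(-5, os.SEEK_END)            # 5 bytes before the end of the file
        tail = fp.read()                    # b'world'
        fp.seek(0)                          # back to the beginning
        fp.seek(6, os.SEEK_CUR)             # 6 bytes forward from the current position
        position = fp.tell()                # 6

# The byte length of a character depends on the encoding.
utf8_len = len('中'.encode('utf-8'))        # 3 bytes in UTF-8
gbk_len = len('中'.encode('gbk'))           # 2 bytes in GBK

print(tail, position, utf8_len, gbk_len)
```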

3. Directory operation

  • A directory, also called a folder, stores files hierarchically, so that files can be kept in separate categories.
  • The os module is a Python built-in module for operating system and file system functions. The results of its functions usually depend on the operating system, so the same code may produce different results on different systems.
  • Common directory operations include checking whether a directory exists, creating a directory, deleting a directory and traversing a directory.

3.1 os and os.path modules

  • In Python, the built-in os module and its submodule os.path are used to work with directories and files. Importing os also makes os.path available.
import os

# Get the operating system type. nt means Windows,
# posix means Linux, Unix or macOS
print(os.name)       # Output: nt

# Get the line separator of the current system
print(os.linesep)    # Output on Windows: '\r\n'. Nothing visible is printed (for example in PyCharm) because the characters are themselves a line break

# Get the path separator used by the operating system
print(os.sep)        # Output on Windows: \

3.2 path

  • A string that locates a file or directory is called a path. Two kinds of path are used in program development: relative paths and absolute paths.

3.2.1 relative path

  • Working directory: the directory that relative paths are resolved against, usually the directory the program was started from. In Python, the getcwd() function of the os module returns the current working directory.
  • When writing a file path in Python, the path separator "\" has to be escaped, that is, written as "\\". Alternatively, use "/" as the separator, or put the letter r (or R) in front of the path string, which makes it a raw string whose separators need no escaping.
import os
print(os.getcwd())   # Output: E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code
A relative path depends on the current working directory. For example, if the current working directory contains a file named yxy.txt, the file name alone is enough to open it.
If the current working directory has a subdirectory demo that holds yxy.txt, open the file as "demo/yxy.txt".
# Open the yxy.txt file directly
with open('yxy.txt','r') as rfile:
    pass

# Open the relative path demo/yxy.txt file
with open("demo/yxy.txt",'r') as rfile:
    pass

# Use a raw string (r prefix) to open a relative path written with backslashes
with open(r"demo\yxy.txt",'r') as rfile:
    pass

3.2.2 absolute path

Absolute path: the full path of the file, independent of the current working directory. The abspath() function of the os.path module returns the absolute path of a file

os.path.abspath(path)
# Get the absolute path
import os
print(os.path.abspath(r"yxy.txt"))
'''
Output: E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code\yxy.txt
'''
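
How the relative and absolute forms relate can be sketched as follows: abspath() gives the same result as joining the current working directory with the relative path and normalizing it. The path demo/yxy.txt is only an example and does not need to exist.

```python
import os

relative = os.path.join('demo', 'yxy.txt')   # example relative path, need not exist
absolute = os.path.abspath(relative)

# Build the same path by hand from the current working directory.
manual = os.path.normpath(os.path.join(os.getcwd(), relative))

print(absolute == manual)          # True
print(os.path.isabs(relative))     # False
print(os.path.isabs(absolute))     # True
```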

3.2.3 joining paths

  • To combine two or more paths into a new path, use the join() function of the os.path module.
  • os.path.join() only builds the string. It does not check whether the resulting path actually exists.
os.path.join(path1[,path2[,......]])
# Join paths
import os
print(os.path.join(r"E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code", r"demo\message.txt"))
'''
Output: E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code\demo\message.txt
Note: this path does not actually exist
'''
# If several of the arguments are absolute paths, the last one (reading left to right) wins and everything before it is ignored
# A raw string cannot end in a single backslash, so C:\ is written as 'C:\\' here
print(os.path.join(r'E:\code', 'C:\\', 'demo'))      # Output: C:\demo
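
The join rules can be tried on any operating system by calling the platform-specific modules behind os.path directly: ntpath implements the Windows rules and posixpath the Unix rules. This is a sketch for experimenting; the paths are made up, and none of them need to exist.

```python
import ntpath       # Windows flavour of os.path, importable on every platform
import posixpath    # Unix flavour of os.path

# Plain joining: the separator is inserted between the parts.
joined = ntpath.join(r'E:\code', 'demo', 'message.txt')
print(joined)        # E:\code\demo\message.txt

# A later absolute path discards everything before it.
reset = ntpath.join(r'E:\code', 'C:\\', 'demo')
print(reset)         # C:\demo

# The same rule with Unix-style paths.
posix_reset = posixpath.join('/home/user', '/etc', 'hosts')
print(posix_reset)   # /etc/hosts
```

In ordinary programs os.path.join is the right call, since it picks the flavour that matches the running system.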

3.4 determine whether the directory exists

  • Sometimes you need to check whether a given directory exists. The exists() function of the os.path module does this.
os.path.exists(path)

Parameter Description:
 path: the path to check, either relative or absolute. It may name a file as well as a directory
 Return value: True if the path exists, False if it does not
import os
print(os.path.exists(r'E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code\demo\message.txt'))  # Output: False
print(os.path.exists(r'E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code'))                   # Output: True
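
Because exists() returns True for files and directories alike, os.path.isdir() and os.path.isfile() are useful when the kind of path matters. A minimal sketch, using a temporary directory and made-up names:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, 'message.txt')   # hypothetical file
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write('hi')
    missing = os.path.join(tmp, 'nope')            # never created

    results = {
        'dir_exists': os.path.exists(tmp),
        'dir_isdir': os.path.isdir(tmp),
        'file_exists': os.path.exists(file_path),
        'file_isdir': os.path.isdir(file_path),
        'file_isfile': os.path.isfile(file_path),
        'missing_exists': os.path.exists(missing),
    }

for key, value in results.items():
    print(key, value)
```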

3.5 create directory

  • The os module provides two functions for creating directories: one creates a single-level directory, the other creates multi-level directories.

3.5.1 create a level 1 directory

  • The mkdir() function of the os module creates only one directory level at a time.
  • That is, only the last level of the given path is created. If the parent directory does not exist, a FileNotFoundError exception is thrown.
os.mkdir(path,mode=0o777)

Parameter Description:
 path: the directory to create, as an absolute or relative path
 mode: the numeric permission mode, 0o777 by default. It is ignored on some systems, such as Windows
# Create the E:\demo directory
import os
if os.path.exists(r'E:\demo'):
    # The directory already exists. Calling mkdir() now would throw FileExistsError
    pass
else:
    print(r'E:\demo does not exist yet, creating it')
    os.mkdir(r'E:\demo')

3.5.2 create multi-level directory

  • To create a multi-level directory, use the makedirs() function of the os module, which creates the directories recursively.
os.makedirs(name,mode=0o777)
# Create the multi-level directory E:\demo\1\2\3\4
import os
path = r'E:\demo\1\2\3\4'          # Specify create directory
if not os.path.exists(path):       # Determine whether the directory exists
    os.makedirs(path)              # Create directory
    print('Directory created successfully')
else:
    print('The directory already exists')
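
As a variation on the exists() check above, makedirs() also accepts an exist_ok argument, which makes the call succeed quietly when the directory is already there. The sketch below runs in a temporary directory instead of E:\demo and also shows mkdir() failing when the parent levels are missing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, 'demo', '1', '2', '3', '4')

    # mkdir() creates a single level only, so missing parents cause an error.
    try:
        os.mkdir(nested)
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True

    # makedirs() creates every missing level.
    os.makedirs(nested, exist_ok=True)
    # With exist_ok=True a second call does not raise FileExistsError.
    os.makedirs(nested, exist_ok=True)
    created = os.path.isdir(nested)

print(mkdir_failed, created)   # True True
```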

3.6 delete directory

  • To delete a directory, use the rmdir() function of the os module.
  • rmdir() only works if the directory to be deleted is empty.
os.rmdir(path)
# Delete the directory 4 under E:\demo\1\2\3
# Running the code a second time throws FileNotFoundError: [WinError 2] The system cannot find the file specified: 'E:\demo\1\2\3\4'
# which shows that the directory has already been deleted. Only an empty directory can be deleted
import os
os.rmdir(r'E:\demo\1\2\3\4')
  • rmdir() can only delete an empty directory. To delete a directory that is not empty, use the rmtree() function of shutil, a module in Python's standard library.
import shutil
shutil.rmtree(r'E:\demo')     # Delete the demo and the following directories and their contents
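
The contrast between the two functions can be tried safely in a temporary directory. In this sketch the demo tree is created just for the test; rmdir() is expected to fail on it because it is not empty, while rmtree() removes it together with its contents.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, 'demo')
    os.makedirs(os.path.join(demo, '1', '2'))   # demo now contains subdirectories

    # rmdir() refuses to delete a directory that is not empty.
    try:
        os.rmdir(demo)
        rmdir_failed = False
    except OSError:
        rmdir_failed = True

    # rmtree() deletes the directory and everything below it.
    shutil.rmtree(demo)
    gone = not os.path.exists(demo)

print(rmdir_failed, gone)   # True True
```

Since rmtree() deletes without asking, it is worth double-checking the path before calling it on real data.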

3.7 traverse directory

  • Traversing a directory means visiting every directory (including subdirectories) and every file under a specified directory.
  • In Python, the walk() function of the os module is used to traverse a directory.
os.walk(top[,topdown][,onerror][,followlinks])

Parameter Description:
 top: the root directory whose contents are traversed
 topdown: optional parameter that sets the traversal order. True means top-down traversal (the root directory is visited first);
         False means bottom-up traversal (the deepest subdirectories are visited first). The default is True
 onerror: optional parameter that sets how errors are handled. Errors are ignored by default. To handle them, pass an error-handling function; the default is usually fine
 followlinks: optional parameter. By default, walk() does not descend into symbolic links that resolve to directories.
             Set it to True to visit the directories that symbolic links point to, on systems that support them
 Return value: a generator that yields 3-tuples (dirpath, dirnames, filenames):
       dirpath: the path currently being traversed, a string
       dirnames: a list of the subdirectories in the current path
       filenames: a list of the files in the current path
# Traverse all files and directories on drive E
import os

p = r'E:\\'
print('[',p,'] contains the following files and directories')
for root,dirs,files in os.walk(p,topdown = True):         # root: current directory, dirs: its subdirectories, files: its files; traverse the directory top-down
    for name in dirs:
        print(os.path.join(root,name))
    for name in files:
        print('\t',os.path.join(root,name))
'''
Output:
E:\\Tomorrow Technology\Python Project development case\23\wechat_robot\.git\refs
	 E:\\Tomorrow Technology\Python Project development case\23\wechat_robot\.git\COMMIT_EDITMSG
	 E:\\Tomorrow Technology\Python Project development case\23\wechat_robot\.git\config
	 E:\\Tomorrow Technology\Python Project development case\23\wechat_robot\.git\description
	 E:\\Tomorrow Technology\Python Project development case\23\wechat_robot\.git\HEAD
	 E:\\Tomorrow Technology\Python Project development case\23\wechat_robot\.git\index
	 E:\\Tomorrow Technology\Python Project development case\23\wechat_robot\.git\hooks\applypatch-msg.sample
	 ...
'''
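
Walking a whole drive takes a long time, so a small tree is easier to experiment with. The following sketch builds a made-up tree (a/b with three files) in a temporary directory, collects every file with walk(), and records the order in which directories are visited when topdown is False.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a small sample tree: top.txt, a/mid.txt, a/b/leaf.txt
    os.makedirs(os.path.join(tmp, 'a', 'b'))
    for rel in ('top.txt',
                os.path.join('a', 'mid.txt'),
                os.path.join('a', 'b', 'leaf.txt')):
        with open(os.path.join(tmp, rel), 'w', encoding='utf-8') as f:
            f.write('x')

    # Collect every file, as a path relative to the root of the walk.
    found = []
    for root, dirs, files in os.walk(tmp):
        for name in files:
            full = os.path.join(root, name)
            found.append(os.path.relpath(full, tmp))
    found.sort()

    # With topdown=False a directory is reported after its subdirectories.
    bottom_up = [os.path.relpath(root, tmp)
                 for root, dirs, files in os.walk(tmp, topdown=False)]

print(found)
print(bottom_up)
```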

4. Advanced file operation

  • Besides directory operations, Python's built-in os module can also perform some advanced operations on files, as follows.

4.1 deleting files

  • Python has no built-in function for deleting files. The built-in os module provides the remove() function for this.
os.remove(path)
# Delete yxy.txt. If the file does not exist, an exception is thrown:
# FileNotFoundError: [WinError 2] The system cannot find the file specified: 'yxy.txt'
import os
os.remove('yxy.txt')
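
One way to deal with the FileNotFoundError is to catch it. The helper remove_if_present() below is a hypothetical name introduced for this sketch, not part of the os module; it reports whether a file was actually deleted.

```python
import os
import tempfile


def remove_if_present(path):
    """Delete path and report whether a file was actually removed (illustrative helper)."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'yxy.txt')
    with open(target, 'w', encoding='utf-8') as f:
        f.write('data')

    first = remove_if_present(target)    # the file exists, so it is deleted
    second = remove_if_present(target)   # the file is already gone

print(first, second)   # True False
```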

4.2 rename files and directories

  • The os module provides the rename() function for renaming files and directories. If src is the path of a file, the file is renamed. If src is the path of a directory, the directory is renamed.
os.rename(src,dst)
# Renaming a directory works the same way as renaming a file
import os

src = r'E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code\1.txt'   # File to rename
dst = r'E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code\2.txt'   # Renamed file
if os.path.exists(src):                           # Check whether the file exists
    os.rename(src,dst)                            # Rename it
    print('File rename completed!')
else:
    print('File does not exist')
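
That rename() treats files and directories alike can be checked with a short sketch. All names here (1.txt, 2.txt, old_dir, new_dir) are placeholders inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Rename a file.
    src = os.path.join(tmp, '1.txt')
    dst = os.path.join(tmp, '2.txt')
    with open(src, 'w', encoding='utf-8') as f:
        f.write('hello world')
    os.rename(src, dst)
    file_renamed = os.path.isfile(dst) and not os.path.exists(src)

    # Rename a directory with the same function.
    old_dir = os.path.join(tmp, 'old_dir')
    new_dir = os.path.join(tmp, 'new_dir')
    os.mkdir(old_dir)
    os.rename(old_dir, new_dir)
    dir_renamed = os.path.isdir(new_dir) and not os.path.exists(old_dir)

print(file_renamed, dir_renamed)   # True True
```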

4.3 getting basic file information

  • Once a file has been created, it carries some information of its own, which can be obtained through the stat() function of the os module.
os.stat(path)

import os                                                     # Import the os module
if os.path.exists('2.txt'):                                   # Check whether the file exists
    fileinfo = os.stat('2.txt')                               # Get the basic information of the file
    print('File full path:', os.path.abspath('2.txt'))        # Get the full path of the file
    print('File size:', fileinfo.st_size, 'bytes')            # Output the basic information of the file
    print('Last modified time:', fileinfo.st_mtime)
'''
Output:
File full path: E:\Qian Feng Education\Twenty-fourth days_Files and directories\Code\2.txt
File size: 11 bytes
Last modified time: 1584004049.6943874
'''
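
The st_mtime value is a timestamp in seconds, which is hard to read as it stands. One possible way to make it readable, sketched here with the standard datetime module and a temporary 2.txt, is to convert it to local time and format it.

```python
import os
import tempfile
from datetime import datetime

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, '2.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('hello world')            # 11 ASCII characters, so 11 bytes

    info = os.stat(path)
    size = info.st_size
    # Convert the timestamp (seconds since the epoch) to local time.
    modified = datetime.fromtimestamp(info.st_mtime)
    readable = modified.strftime('%Y-%m-%d %H:%M:%S')

print('File size:', size, 'bytes')
print('Last modified time:', readable)
```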

Tags: Python encoding git Unix

Posted on Thu, 12 Mar 2020 05:46:04 -0700 by SBukoski