2.1 Reading the Entire File
In Python, there are several ways to read text files, and each one suits different situations. The main ones are reading the entire content at once, reading all lines into a list, iterating over the file line by line, and reading a specified number of characters.
The read() method reads the entire contents of the file into one string.
file = open('example.txt', 'r')
content = file.read()
print(content)
file.close()
This is a very simple way to read a file: one method call and the entire content is in a string. There are downsides too. If the file contains, say, 200 MB of logs, reading it takes noticeably longer, and the whole content has to be held in your app's memory at once.
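As a small illustration of what read() returns, the sketch below writes a throwaway file and reads it back in a single call. The file name demo_read.txt and the temporary directory are assumptions made for this example only. It uses a with block, which closes the file automatically and has the same effect as calling close().

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
# The name 'demo_read.txt' is hypothetical.
path = os.path.join(tempfile.mkdtemp(), 'demo_read.txt')
with open(path, 'w') as f:
    f.write('first line\nsecond line\n')

# 'with' closes the file automatically when the block ends.
with open(path, 'r') as f:
    content = f.read()  # the whole file as a single string

print(repr(content))    # newline characters are part of the string
print(len(content))     # number of characters that were read
```

Printing repr(content) makes the newline characters visible, which shows that read() returns the text exactly as stored rather than a list of lines.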
2.2 Reading All Lines from a File
There's an alternative to the read() method: the readlines() method. It also reads the entire file into memory, but it returns the content as a list of strings, where each line of the file is a separate element of the list.
file = open('example.txt', 'r')
lines = file.readlines()
for line in lines:
    print(line.strip())  # strip() removes leading and trailing whitespace, including the newline
file.close()
This method can be convenient if you know in advance that you need to process the file contents line by line. The downside is that this method can consume a lot of memory for very large files since all lines are loaded into memory.
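One detail that is easy to miss is that every string returned by readlines() still ends with its newline character. The following sketch shows this on a throwaway file; the file name demo_lines.txt is an assumption made for the example.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'demo_lines.txt')
with open(path, 'w') as f:
    f.write('alpha\nbeta\ngamma\n')

# 'with' closes the file automatically when the block ends.
with open(path, 'r') as f:
    lines = f.readlines()  # list of strings, one per line

print(lines)  # each element still ends with '\n'

# Remove only the trailing newline and keep any other whitespace.
cleaned = [line.rstrip('\n') for line in lines]
print(cleaned)
```

Using rstrip('\n') instead of strip() is a design choice here: it keeps leading spaces or tabs, which matters if indentation in the file is meaningful.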
2.3 Iterating Over File Lines
The file object has a built-in iterator, so you can iterate over its contents using a for loop. This allows you to read the file line by line without loading the entire file into memory.
Example:
file = open('example.txt', 'r')
for line in file:
    print(line.strip())
file.close()
This method is more memory efficient for large files since lines are read one at a time. However, it may be more challenging to handle if you need to go back to a previous line or change the reading order.
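Because lines arrive one at a time, a loop over the file can stop as soon as it has what it needs, without reading the rest. The sketch below illustrates this; the helper find_first_match and the file name demo_log.txt are hypothetical names introduced for the example.

```python
import os
import tempfile


def find_first_match(path, needle):
    """Return (line_number, text) for the first line containing needle, or None."""
    with open(path, 'r') as f:  # 'with' closes the file automatically
        for number, line in enumerate(f, start=1):
            if needle in line:
                # Returning here stops the iteration; later lines are never read.
                return number, line.rstrip('\n')
    return None


# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'demo_log.txt')
with open(path, 'w') as f:
    f.write('INFO start\nERROR disk full\nINFO done\n')

result = find_first_match(path, 'ERROR')
print(result)
```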
Let's compare this approach with the previous one:
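Using Iterator | Using readlines() Method |
---|---|
for line in file: | for line in file.readlines(): |
Reads one line at a time | Loads all lines into a list before the loop starts |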
The iterator approach is simpler, and it avoids building the full list of lines before processing starts. However, in real projects it is sometimes quicker to load all the data into memory and work with it there.
The readline() and readlines() methods are both used for reading lines from a file, but they work differently. readline() reads one line at a time, allowing control over the reading process without loading the entire file into memory. This is convenient when processing a file line by line or when the file is too large to load entirely into memory.
readlines(), in contrast, reads all lines of a file at once and returns them as a list. This method is convenient if you need to quickly get all lines of a file for further processing. However, it consumes more memory, especially for large files, since the entire file is loaded into memory at once.
Depending on the task, readline() may be preferable if saving memory and controlling the reading process are important, while readlines() is convenient when you need to get all lines of the file at once.
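The two methods can also be combined, because both continue from the current position in the file. A minimal sketch, assuming a small file with a header line followed by data rows (the file name demo_table.txt and its contents are made up for the example):

```python
import os
import tempfile

# Hypothetical sample file: one header line followed by data rows.
path = os.path.join(tempfile.mkdtemp(), 'demo_table.txt')
with open(path, 'w') as f:
    f.write('name,score\nann,90\nbob,85\n')

# 'with' closes the file automatically when the block ends.
with open(path, 'r') as f:
    header = f.readline()  # reads only the first line
    rows = f.readlines()   # reads everything after the current position

print(repr(header))
print(rows)
```

Here readline() consumes the header, so the list returned by readlines() contains only the remaining lines.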
2.4 Reading Part of a File
If a file is too large, you can read it in parts. You can pass a parameter n to the read() method, which specifies the maximum number of characters to read. If fewer than n characters remain in the file, read(n) simply reads to the end of the file.
Example:
file = open('example.txt', 'r')
content = file.read(10) # Reads the first 10 characters
print(content)
file.close()
This is convenient for reading large files in chunks or for processing fixed blocks of data. However, this approach ignores line division in the file—lines may be broken in the middle.
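To process a whole file in fixed-size pieces, read(n) can be called repeatedly until it returns an empty string, which signals the end of the file. The sketch below wraps that loop in a generator; the name read_in_chunks and the file name demo_chunks.txt are hypothetical and exist only for this example.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size):
    """Yield successive pieces of the file, each at most chunk_size characters long."""
    with open(path, 'r') as f:  # 'with' closes the file automatically
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty string means the end of the file
                break
            yield chunk


# Hypothetical sample file containing 26 characters.
path = os.path.join(tempfile.mkdtemp(), 'demo_chunks.txt')
with open(path, 'w') as f:
    f.write('abcdefghijklmnopqrstuvwxyz')

chunks = list(read_in_chunks(path, 10))
print(chunks)  # the last chunk is shorter than the others
```

Only one chunk is held in memory at a time while the generator runs; collecting the chunks into a list here is done purely to display them.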
2.5 Reading a File Line by Line
If for some reason you don't want to use an iterator, you can manually read a file line by line. The file object has a readline() method, which reads one line from the file at a time. Don't confuse it with readlines().
Example:
file = open('example.txt', 'r')
line = file.readline()
while line:
    print(line.strip())
    line = file.readline()
file.close()
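In this example, we read the file line by line until readline() returns an empty string, which signals the end of the file. A blank line inside the file is returned as '\n', not as an empty string, so it does not stop the loop.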
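The difference between a blank line and the end of the file can be checked directly. The following sketch calls readline() more times than the file has lines; the file name demo_blank.txt and its contents are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical sample file whose second line is blank.
path = os.path.join(tempfile.mkdtemp(), 'demo_blank.txt')
with open(path, 'w') as f:
    f.write('one\n\nthree\n')

# Call readline() five times on a three-line file.
with open(path, 'r') as f:  # 'with' closes the file automatically
    results = [f.readline() for _ in range(5)]

print(results)  # the blank line is '\n'; calls past the end return ''

# The while loop therefore counts the blank line as a line.
count = 0
with open(path, 'r') as f:
    line = f.readline()
    while line:
        count += 1
        line = f.readline()

print(count)
```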