We want to avoid saving files to disk when we use them immediately. This is especially useful in ML pipelines.
Doing this also removes the need for file cleanup.
To keep our files in memory instead of on disk, we use the BytesIO class from Python's io module.
from io import BytesIO


def generate_file_in_memory():
    # Generate file content
    file_content = "This is a sample text file content."
    # Save to in-memory buffer
    file_buffer = BytesIO()
    # The encode() part is not always necessary,
    # savefig from matplotlib does it automatically for example.
    file_buffer.write(file_content.encode())
    return file_buffer


file_buffer = generate_file_in_memory()
print(file_buffer.getvalue())
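The seek(0) operation moves the buffer's read/write position back to zero. A new BytesIO starts at position zero, but every write advances the position, so after writing it sits at the end of the data. A read() at that point returns nothing unless you call seek(0) first. getvalue() ignores the position and always returns the full contents, which is why the example above works without seek(0). This is the reason you often see seek(0) in code: call it before handing the buffer to anything that reads it like a file. If a consumer seems to receive an empty file, a missing seek(0) is the first thing to check.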
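To make the position behaviour concrete, here is a minimal sketch using only the standard library. The variable names (data, position_after_write and so on) are made up for illustration; only BytesIO and its write, tell, read, seek and getvalue methods come from the io module.

```python
from io import BytesIO

# Illustrative payload; any bytes would do.
data = "This is a sample text file content.".encode()

buffer = BytesIO()
buffer.write(data)

# After writing, the position sits at the end of the data.
position_after_write = buffer.tell()

# read() starts at the current position, so nothing is left to read.
read_without_seek = buffer.read()

# getvalue() ignores the position and returns the full contents.
full_contents = buffer.getvalue()

# Rewind so a consumer that calls read() sees the data from the start.
buffer.seek(0)
read_after_seek = buffer.read()

print(position_after_write == len(data))
print(read_without_seek)  # b''
print(full_contents == data)
print(read_after_seek == data)
```

If you return a buffer from a function and the caller is going to read() it, calling seek(0) just before the return is a cheap way to avoid the empty-read surprise.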