The TextReader class
-
class ferenda.TextReader(filename=None, encoding=None, string=None, linesep=None)
A file-like class for reading (not writing) text files by line, paragraph, page or any other user-defined unit of text, with support for peeking ahead and looking backwards. It can read byte streams in different encodings, but always returns real text strings (unicode in Python 2). Alternatively, it can be initialized from an existing string.
-
UNIX = '\n'
Unix line endings, for use with the linesep parameter.
-
DOS = '\r\n'
DOS/Windows line endings, for use with the linesep parameter.
-
MAC = '\r'
Old-style Mac line endings, for use with the linesep parameter.
-
cue(string)
Set the seek position at the beginning of string, searching forward from the current seek position. Raises IOError if string is not found.
-
cuepast(string)
Set the seek position just past the end of string, searching forward from the current seek position. Raises IOError if string is not found.
-
readto(string)
Read and return all text between the current seek position and string. Sets the new seek position at the start of string. Raises IOError if string is not found.
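The way cue, cuepast and readto move the seek position can be illustrated with a small stand-in class. This is only a sketch: MiniReader is a hypothetical name, it works on an in-memory string, and it is not ferenda's implementation. It also assumes that cuepast leaves the position just after the match, as the method name suggests.

```python
class MiniReader:
    """Hypothetical stand-in for TextReader (not ferenda's implementation)."""

    def __init__(self, string):
        self.data = string
        self.pos = 0  # current seek position

    def cue(self, string):
        # Search forward from the current position only.
        idx = self.data.find(string, self.pos)
        if idx == -1:
            raise IOError("string not found")
        self.pos = idx

    def cuepast(self, string):
        # Assumed behavior: like cue(), then skip over the match itself.
        self.cue(string)
        self.pos += len(string)

    def readto(self, string):
        # Return the text before the match and stop at its first character.
        idx = self.data.find(string, self.pos)
        if idx == -1:
            raise IOError("string not found")
        chunk = self.data[self.pos:idx]
        self.pos = idx
        return chunk


reader = MiniReader("Header: value\nBody text\nEND")
reader.cue("value")
at_value = reader.pos        # position of the "v" in "value"
reader.cuepast("value")
after_value = reader.pos     # position just after "value"
body = reader.readto("END")  # everything up to, but not including, "END"
```

After these calls the position rests on the "E" of "END", so a following read would start with the delimiter itself.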
-
readparagraph()
Reads and returns the next paragraph (all text up to two or more consecutive line separators).
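The paragraph rule ("two or more consecutive line separators") can be sketched with a regular expression over a plain string. The function name read_paragraph and its (text, pos) signature are hypothetical, and the handling of the final paragraph (returning whatever text remains) is an assumption, not documented TextReader behavior.

```python
import re


def read_paragraph(text, pos, linesep="\n"):
    """Return (paragraph, new_pos) for the paragraph starting at pos."""
    # A paragraph boundary is a run of two or more line separators.
    boundary = re.compile("(?:%s){2,}" % re.escape(linesep))
    match = boundary.search(text, pos)
    if match is None:
        # Assumption: with no boundary left, the rest of the text is the paragraph.
        return text[pos:], len(text)
    return text[pos:match.start()], match.end()


text = "First line\nstill first\n\n\nSecond paragraph\n"
para1, pos = read_paragraph(text, 0)
para2, pos = read_paragraph(text, pos)
```

A single line separator inside a paragraph is kept, which is why "still first" stays part of the first paragraph.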
-
lastread()
Returns the last chunk of data that was actually read (i.e. the peek* and prev* methods do not affect this).
-
peekline(times=1)
Works like readline(), but does not affect the current seek position. If times is specified, peeks that many lines ahead.
-
peekparagraph(times=1)
Works like readparagraph(), but does not affect the current seek position. If times is specified, peeks that many paragraphs ahead.
-
peekchunk(delimiter, times=1)
Works like readchunk(), but does not affect the current seek position. If times is specified, peeks that many chunks ahead.
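The peek* idea (read ahead, then restore the seek position) can be sketched with the standard library's io.StringIO. The helper below is a hypothetical illustration of the pattern, not TextReader's own code. Note that io.StringIO.readline() keeps the trailing newline, which TextReader may or may not do.

```python
import io


def peekline(stream, times=1):
    """Return the line `times` lines ahead without moving the stream position."""
    saved = stream.tell()
    try:
        line = ""
        for _ in range(times):
            line = stream.readline()
        return line
    finally:
        # Restore the position even if reading fails.
        stream.seek(saved)


stream = io.StringIO("alpha\nbeta\ngamma\n")
first = peekline(stream)            # the next line
second = peekline(stream, times=2)  # the line after that
position = stream.tell()            # unchanged by either peek
```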
-
prev(size=0)
Works like read(), but reads backwards from the current seek position and does not affect it.
-
prevline(times=1)
Works like readline(), but reads backwards from the current seek position and does not affect it. If times is specified, reads the line that many lines back.
-
prevparagraph(times=1)
Works like readparagraph(), but reads backwards from the current seek position and does not affect it. If times is specified, reads the paragraph that many paragraphs back.
-
prevchunk(delimiter, times=1)
Works like readchunk(), but reads backwards from the current seek position and does not affect it. If times is specified, reads the chunk that many chunks back.
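One possible reading of the prev* methods is sketched below for lines: only the text before the current position is considered, and times counts complete lines backwards. The function name, its (text, pos) signature, and the choice to raise IOError when there are not enough lines are all assumptions made for illustration. TextReader's exact edge-case behavior may differ.

```python
def prevline(text, pos, times=1, linesep="\n"):
    """Return the line `times` lines before pos; pos itself is not changed."""
    before = text[:pos]
    # If pos sits right after a separator, drop it so that the last
    # complete line is the first candidate.
    if before.endswith(linesep):
        before = before[:-len(linesep)]
    lines = before.split(linesep)
    if times < 1 or times > len(lines):
        raise IOError("not enough lines before current position")
    return lines[-times]


text = "one\ntwo\nthree\n"
pos = text.index("three")             # seek position at the start of line 3
one_back = prevline(text, pos)        # the line just before the position
two_back = prevline(text, pos, times=2)
```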
-
getreader(callableObj, *args, **kwargs)
Enables you to treat the result of any single read*, peek* or prev* method as a new TextReader. Particularly useful for processing individual pages in page-oriented documents:

    from ferenda import TextReader

    filereader = TextReader("rfc822.txt")
    firstpagereader = filereader.getreader(filereader.readpage)
    # firstpagereader is now a standalone TextReader that only
    # contains the first page of text from rfc822.txt
    filereader.seek(0)  # reset current seek position
    page5reader = filereader.getreader(filereader.peekpage, times=5)
    # page5reader now contains the 5th page of text from rfc822.txt
-
getiterator(callableObj, *args, **kwargs)
Returns an iterator over the results of repeated calls to callableObj:

    from ferenda import TextReader

    filereader = TextReader("dashed.txt")
    # dashed.txt contains paragraphs separated by "----"
    for para in filereader.getiterator(filereader.readchunk, "----"):
        print(para)
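The iterator pattern itself can be sketched as a generator that keeps calling a function until it signals the end of the text. This sketch assumes that end of input is reported by raising IOError. Both make_chunk_reader and this standalone getiterator are hypothetical helpers written for illustration, not ferenda code.

```python
def getiterator(callable_obj, *args, **kwargs):
    """Yield callable_obj(*args, **kwargs) until it raises IOError."""
    try:
        while True:
            yield callable_obj(*args, **kwargs)
    except IOError:
        return  # assumed end-of-text signal


def make_chunk_reader(text):
    """Return a readchunk(delimiter) function that walks through text."""
    state = {"pos": 0}

    def readchunk(delimiter):
        if state["pos"] >= len(text):
            raise IOError("end of text")
        idx = text.find(delimiter, state["pos"])
        if idx == -1:
            # No more delimiters: the rest of the text is the last chunk.
            chunk = text[state["pos"]:]
            state["pos"] = len(text)
        else:
            chunk = text[state["pos"]:idx]
            state["pos"] = idx + len(delimiter)
        return chunk

    return readchunk


readchunk = make_chunk_reader("first\n----\nsecond\n----\nthird\n")
paragraphs = list(getiterator(readchunk, "----"))
```

Because the arguments are forwarded unchanged on every call, the same generator works for any reading function, whatever unit of text it returns.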
-
flush()
See io.IOBase.flush(). This is a no-op.
-
read(size=0)
See io.TextIOBase.read().
-
seek(offset, whence=0)
See io.TextIOBase.seek().
Note: The whence parameter is not supported.
-
tell()
See io.TextIOBase.tell().
-
next()
Backwards-compatibility alias for iterating over a file in Python 2. Use getiterator() to iterate over anything other than lines (e.g. paragraphs or pages).