The TextReader class
-
class ferenda.TextReader(filename=None, encoding=None, string=None, linesep=None)
Fancy file-like class for reading (not writing) text files by line, paragraph, page or any other user-defined unit of text, with support for peeking ahead and looking backwards. It can read byte streams in different encodings, but converts everything to real strings (unicode in Python 2). Alternatively, it can be initialized from an existing string.
Parameters: -
UNIX = '\n'
Unix line endings, for use with the linesep parameter.
-
DOS = '\r\n'
DOS/Windows line endings, for use with the linesep parameter.
-
MAC = '\r'
Old-style Mac line endings, for use with the linesep parameter.
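To see why the choice of line separator matters, here is a minimal sketch in plain Python. It does not use ferenda; the three constants simply mirror the values documented above, and `split_lines` is a hypothetical helper introduced for illustration.

```python
# Same values as the UNIX, DOS and MAC constants documented above.
UNIX = "\n"
DOS = "\r\n"
MAC = "\r"


def split_lines(text, linesep):
    """Split text into lines on one specific line separator."""
    return text.split(linesep)


dos_text = "first\r\nsecond\r\nthird"

# Splitting with the matching separator gives clean lines.
clean = split_lines(dos_text, DOS)

# Splitting DOS text on the Unix separator leaves a stray '\r' behind.
dirty = split_lines(dos_text, UNIX)

print(clean)
print(dirty)
```

Passing the separator that matches the file avoids stray carriage returns at the end of each line.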
-
cue(string)
Set seek position at the beginning of string, searching from the current seek position. Raises IOError if string is not found.
-
cuepast(string)
Set seek position just past the end of string, searching from the current seek position. Raises IOError if string is not found.
-
readto(string)
Read and return all text between the current seek position and string. Sets the new seek position at the start of string. Raises IOError if string is not found.
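The three positioning methods above can be sketched on a plain string. `MiniReader` is a hypothetical stand-in written for this illustration, not ferenda's implementation; in particular it assumes that `cuepast` leaves the position immediately after the matched string, as the method name suggests.

```python
class MiniReader:
    """Hypothetical stand-in for TextReader; not ferenda's implementation."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # current seek position

    def _find(self, string):
        # Search forward from the current position only.
        idx = self.text.find(string, self.pos)
        if idx == -1:
            raise IOError("string not found: %r" % string)
        return idx

    def cue(self, string):
        # Position at the start of the match.
        self.pos = self._find(string)

    def cuepast(self, string):
        # Assumption: position just after the match.
        self.pos = self._find(string) + len(string)

    def readto(self, string):
        # Return the text before the match and stop at its start.
        idx = self._find(string)
        chunk = self.text[self.pos:idx]
        self.pos = idx
        return chunk


reader = MiniReader("Title: Example\nBody: text")
reader.cuepast("Title: ")
title = reader.readto("\n")
reader.cue("Body")
body_pos = reader.pos
```

A typical pattern is to skip a known label with `cuepast` and then collect the value with `readto`.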
-
readparagraph()
Reads and returns the next paragraph (all text up to two or more consecutive line separators).
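The paragraph rule (text up to a run of two or more line separators) can be illustrated with a short regular-expression sketch. `read_paragraph` is a hypothetical helper that works on a string and an explicit position; it is not how ferenda necessarily implements `readparagraph()`.

```python
import re


def read_paragraph(text, pos=0, linesep="\n"):
    """Return (paragraph, new_pos) for the paragraph starting at pos."""
    # A paragraph boundary is two or more consecutive separators.
    boundary = re.compile("(?:%s){2,}" % re.escape(linesep))
    match = boundary.search(text, pos)
    if match is None:
        # No boundary left: the rest of the text is the last paragraph.
        return text[pos:], len(text)
    return text[pos:match.start()], match.end()


text = "First line\nstill first\n\n\nSecond paragraph\n"
first, pos = read_paragraph(text)
second, pos = read_paragraph(text, pos)
```

A single line separator inside a paragraph is kept, while the whole run of blank lines between paragraphs is consumed.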
-
lastread()
Returns the last chunk of data that was actually read (i.e. the peek* and prev* methods do not affect this).
-
peekline(times=1)
Works like readline(), but does not affect current seek position. If times is specified, peeks that many lines ahead.
-
peekparagraph(times=1)
Works like readparagraph(), but does not affect current seek position. If times is specified, peeks that many paragraphs ahead.
-
peekchunk(delimiter, times=1)
Works like readchunk(), but does not affect current seek position. If times is specified, peeks that many chunks ahead.
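The difference between reading and peeking can be sketched with a small line-based class. `LineReader` is hypothetical and exists only for this illustration; it assumes that `times=2` means "the second line ahead", which is how the description above reads.

```python
class LineReader:
    """Hypothetical line reader used only to illustrate peeking."""

    def __init__(self, text, linesep="\n"):
        self.lines = text.split(linesep)
        self.index = 0  # index of the next unread line

    def readline(self):
        # Return the next line and advance the position.
        if self.index >= len(self.lines):
            raise IOError("end of text")
        line = self.lines[self.index]
        self.index += 1
        return line

    def peekline(self, times=1):
        # Look ahead without touching self.index.
        target = self.index + times - 1
        if times < 1 or target >= len(self.lines):
            raise IOError("cannot peek %d line(s) ahead" % times)
        return self.lines[target]


reader = LineReader("one\ntwo\nthree")
ahead = reader.peekline(2)   # second line ahead; position unchanged
first = reader.readline()    # still returns the first line
```

Because peeking leaves the position alone, a parser can inspect upcoming text before deciding how to consume it.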
-
prev(size=0)
Works like read(), but reads backwards from current seek position, and does not affect it.
-
prevline(times=1)
Works like readline(), but reads backwards from current seek position, and does not affect it. If times is specified, reads the line that many times back.
-
prevparagraph(times=1)
Works like readparagraph(), but reads backwards from current seek position, and does not affect it. If times is specified, reads the paragraph that many times back.
-
prevchunk(delimiter, times=1)
Works like readchunk(), but reads backwards from current seek position, and does not affect it. If times is specified, reads the chunk that many times back.
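Looking backwards can be sketched the same way on a plain string. Both functions below are hypothetical helpers, and they rest on two assumptions that the text above does not confirm: that `size=0` means "everything back to the start", and that the previous line is the last complete line before the current position.

```python
def prev(text, pos, size=0):
    """Return up to `size` characters that precede pos (pos is untouched)."""
    # Assumption: size=0 means everything back to the start of the text.
    start = 0 if size == 0 else max(0, pos - size)
    return text[start:pos]


def prevline(text, pos, times=1, linesep="\n"):
    """Return the complete line `times` lines before the current position."""
    # Lines that end before pos; the final element is the partially
    # read current line and is dropped.
    completed = text[:pos].split(linesep)[:-1]
    if times < 1 or times > len(completed):
        raise IOError("no line %d step(s) back" % times)
    return completed[-times]


text = "one\ntwo\nthree"
pos = text.index("three")  # pretend the first two lines were already read

last_four = prev(text, pos, 4)
one_back = prevline(text, pos)
two_back = prevline(text, pos, times=2)
```

Since the position is passed in and never modified, the helpers behave like the prev* methods in that they leave the seek position as it was.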
-
getreader(callableObj, *args, **kwargs)
Enables you to treat the result of any single read*, peek* or prev* method as a new TextReader. Particularly useful to process individual pages in page-oriented documents:

    filereader = TextReader("rfc822.txt")
    firstpagereader = filereader.getreader(filereader.readpage)
    # firstpagereader is now a standalone TextReader that only
    # contains the first page of text from rfc822.txt
    filereader.seek(0)  # reset current seek position
    page5reader = filereader.getreader(filereader.peekpage, times=5)
    # page5reader now contains the 5th page of text from rfc822.txt
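The idea behind getreader() is to call a read-style function and wrap its string result in a fresh file-like object. A minimal sketch using only the standard library follows; `get_reader` and `read_page` are hypothetical names, `io.StringIO` stands in for TextReader, and the form feed page separator is an assumption made for the sample data.

```python
import io


def get_reader(callable_obj, *args, **kwargs):
    """Call a read-style function and wrap its result in a new reader."""
    return io.StringIO(callable_obj(*args, **kwargs))


# Sample data: pages separated by form feeds (an assumption for this sketch).
pages = "page one\nline a\fpage two\nline b\fpage three".split("\f")


def read_page(number=1):
    """Hypothetical page reader returning the text of one page."""
    return pages[number - 1]


page2_reader = get_reader(read_page, number=2)
heading = page2_reader.readline()
rest = page2_reader.read()
```

The wrapped reader sees only the one page, so any line-oriented logic can run on it without risk of running into the next page.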
-
getiterator(callableObj, *args, **kwargs)
Returns an iterator over the results of repeated calls to callableObj:

    filereader = TextReader("dashed.txt")
    # dashed.txt contains paragraphs separated by "----"
    for para in filereader.getiterator(filereader.readchunk, "----"):
        print(para)
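An iterator of this kind can be sketched as a generator that keeps calling the read function until the input runs out. This is an illustration, not ferenda's code: `get_iterator` and `ChunkSource` are hypothetical, and the sketch assumes the read function raises IOError at the end of the text.

```python
def get_iterator(callable_obj, *args, **kwargs):
    """Yield successive results of callable_obj until the input is exhausted."""
    while True:
        try:
            yield callable_obj(*args, **kwargs)
        except IOError:
            # Assumption: the read function raises IOError at end of text.
            return


class ChunkSource:
    """Hypothetical reader that returns delimiter-separated chunks."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def readchunk(self, delimiter):
        if self.pos >= len(self.text):
            raise IOError("end of text")
        idx = self.text.find(delimiter, self.pos)
        if idx == -1:
            # Last chunk: everything that remains.
            chunk = self.text[self.pos:]
            self.pos = len(self.text)
        else:
            chunk = self.text[self.pos:idx]
            self.pos = idx + len(delimiter)
        return chunk


source = ChunkSource("alpha----beta----gamma")
chunks = list(get_iterator(source.readchunk, "----"))
```

Any extra positional or keyword arguments are forwarded unchanged on every call, which is how the delimiter reaches `readchunk`.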
-
flush()
See io.IOBase.flush(). This is a no-op.
-
read(size=0)
See io.TextIOBase.read().
-
seek(offset, whence=0)
See io.TextIOBase.seek().
Note: The whence parameter is not supported.
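When only absolute offsets are available, a relative move can be built from tell() and seek(). The sketch below uses `io.StringIO` as a stand-in for TextReader and assumes that tell() returns an offset that seek() accepts; `seek_relative` is a hypothetical helper.

```python
import io


def seek_relative(stream, delta):
    """Move delta characters from the current position using absolute seeks."""
    stream.seek(max(0, stream.tell() + delta))


reader = io.StringIO("0123456789")
reader.seek(4)            # absolute offset from the start
pair = reader.read(2)     # reads "45"; position is now 6

seek_relative(reader, -3)  # back three characters, to offset 3
digit = reader.read(1)
```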
-
tell()
See io.TextIOBase.tell().
-
next()
Backwards-compatibility alias for iterating over a file in Python 2. Use getiterator() to make iteration work over anything other than lines (e.g. paragraphs, pages, etc.).
-