The REXMLReader is the 'default' parser, since REXML is almost certain to be available. It uses REXML's PullParser to handle large documents without consuming excessive memory, but it is still REXML (read: slow), so it is a good idea to use an alternative parser if one is available. If you don't know which parser is the best one available, you can use the MagicReader or set:
MARC::XMLReader.parser="magic"
or
reader = MARC::XMLReader.new(fh, :parser => "magic")
(or pass the equivalent constant in place of the string)
which will cascade down to REXML if nothing better is found.
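The cascading idea can be illustrated with a minimal sketch. This is not ruby-marc's actual selection code: the helper name `best_available_parser` and the candidate list are assumptions made for illustration. It simply tries to `require` each library in order of preference and returns the first one that loads.

```ruby
# Hypothetical helper (not part of ruby-marc): returns the first library
# from the candidate list that can be required, or nil if none can.
# The default candidate order is an assumption for this sketch.
def best_available_parser(candidates = %w[nokogiri rexml/document])
  candidates.find do |lib|
    begin
      require lib
      true
    rescue LoadError
      # This library is not installed; fall through to the next candidate.
      false
    end
  end
end

# A library that does not exist is skipped, so the result falls back to REXML.
FALLBACK_CHOICE = best_available_parser(%w[no_such_parser_lib_for_demo rexml/document])
```

Putting REXML last in the list mirrors the behaviour described above: it is only chosen when nothing earlier in the list can be loaded.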
# File lib/marc/xml_parsers.rb, line 155
def self.extended(receiver)
  require 'rexml/document'
  require 'rexml/parsers/pullparser'
  receiver.init
end
Loop through the MARC records in the XML document
# File lib/marc/xml_parsers.rb, line 167
def each
  while @parser.has_next?
    event = @parser.pull
    # if it's the start of a record element
    if event.start_element? and strip_ns(event[0]) == 'record'
      yield build_record
    end
  end
end
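The same pull-parsing loop can be tried with REXML alone. The sketch below is a simplified stand-in, not the library's code: `strip_ns` here is a hypothetical reimplementation (the real helper is defined elsewhere in ruby-marc), `count_records` and `SAMPLE_XML` are invented for the example, and the loop only counts record start tags instead of building MARC records.

```ruby
require 'rexml/parsers/pullparser'
require 'stringio'

# Hypothetical stand-in for the library's strip_ns helper: drops any
# namespace prefix such as "marc:" from an element name.
def strip_ns(name)
  name.sub(/\A.*:/, '')
end

# Pulls events one at a time and counts the start tags of <record>
# elements, so the whole document is never held in memory at once.
def count_records(handle)
  parser = REXML::Parsers::PullParser.new(handle)
  count = 0
  while parser.has_next?
    event = parser.pull
    # event[0] is the element name for a start_element event
    count += 1 if event.start_element? && strip_ns(event[0]) == 'record'
  end
  count
end

# Small invented sample; the leader values are placeholders.
SAMPLE_XML = <<~XML
  <collection xmlns:marc="http://www.loc.gov/MARC21/slim">
    <marc:record><marc:leader>00000nam a2200000 a 4500</marc:leader></marc:record>
    <marc:record><marc:leader>00000nam a2200000 a 4500</marc:leader></marc:record>
  </collection>
XML

RECORD_COUNT = count_records(StringIO.new(SAMPLE_XML))
```

In the reader itself, the point where this sketch increments a counter is where `build_record` is called and its result yielded to the caller.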
Sets our parser
# File lib/marc/xml_parsers.rb, line 162
def init
  @parser = REXML::Parsers::PullParser.new(@handle)
end
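The way `self.extended` and `init` cooperate can be seen in isolation with a small sketch. All names below (`DemoReader`, `DemoHost`, `parser_name`) are invented for illustration, and a symbol stands in for the real pull parser. Ruby calls `extended` automatically when an object is extended with the module, which is what triggers `init` on the receiver.

```ruby
# Hypothetical module mimicking the hook pattern used by the reader.
module DemoReader
  # Called by Ruby whenever an object does `obj.extend(DemoReader)`.
  def self.extended(receiver)
    receiver.init
  end

  # Stand-in for the real init, which would build a pull parser
  # from the receiver's @handle.
  def init
    @parser = :demo_parser
  end

  def parser_name
    @parser
  end
end

class DemoHost; end

DEMO_HOST = DemoHost.new
DEMO_HOST.extend(DemoReader) # triggers DemoReader.extended, then init
```

After the `extend` call the host object already has its parser set, without any explicit call to `init` by the caller.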