Class CodeRay::Tokens
In: lib/coderay/tokens.rb
Parent: Array

Tokens TODO: Rewrite!

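The Tokens class represents a list of tokens returned from a Scanner.

A token is not a special object, just a two-element Array consisting of the token text (a String) or a token action (a Symbol such as :begin_group), followed by the token kind (a Symbol).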

A token looks like this:

  ['# It looks like this', :comment]
  ['3.1415926', :float]
  ['$^', :error]

Some scanners also yield sub-tokens, represented by special token actions, namely begin_group and end_group.

The Ruby scanner, for example, splits "a string" into:

 [
  [:begin_group, :string],
  ['"', :delimiter],
  ['a string', :content],
  ['"', :delimiter],
  [:end_group, :string]
 ]
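
As a minimal sketch of how such a list can be consumed, the following plain-Ruby snippet walks the nested form shown above, rebuilds the source text, and records which groups were open for each text token. The helper name walk_tokens is hypothetical and not part of CodeRay; it only assumes the [text_or_action, kind] layout from the example.

```ruby
# Tokens in the nested [text_or_action, kind] form shown above.
tokens = [
  [:begin_group, :string],
  ['"', :delimiter],
  ['a string', :content],
  ['"', :delimiter],
  [:end_group, :string]
]

# Hypothetical helper (not a CodeRay method): rebuild the source text and
# remember which groups were open when each text token was seen.
def walk_tokens(tokens)
  open_groups = []
  text = String.new
  seen = []
  tokens.each do |first, kind|
    case first
    when :begin_group then open_groups.push(kind)
    when :end_group   then open_groups.pop
    when String
      text << first
      seen << [first, kind, open_groups.dup]
    end
  end
  [text, seen]
end

text, seen = walk_tokens(tokens)
```

Here text holds the original source again, and every entry in seen carries [:string] as its list of open groups.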

Tokens is the interface between Scanners and Encoders: The input is split and saved into a Tokens object. The Encoder then builds the output from this object.

Thus, the syntax below becomes clear:

  CodeRay.scan('price = 2.59', :ruby).html
  # the Tokens object is here -------^

See how small it is? ;)

Tokens gives you the power to handle pre-scanned code very easily: You can convert it to a webpage, a YAML file, or dump it into a gzipped string that you put in your DB.

It also allows you to generate tokens directly (without using a scanner), to load them from a file, and still use any Encoder that CodeRay provides.

Methods

Classes and Modules

Module CodeRay::Tokens::Undumping

External Aliases

push -> text_token
concat -> tokens

Attributes

scanner  [RW]  The Scanner instance that created the tokens.

Public Class methods

Unzip the dump using GZip.gunzip, then undump the object using Marshal.load.

The result is commonly a Tokens object, but this is not guaranteed.

[Source]

     # File lib/coderay/tokens.rb, line 201
     def Tokens.load dump
       dump = GZip.gunzip dump
       @dump = Marshal.load dump
     end

Public Instance methods

[Source]

     # File lib/coderay/tokens.rb, line 207
     def begin_group kind; push :begin_group, kind end

[Source]

     # File lib/coderay/tokens.rb, line 209
     def begin_line kind; push :begin_line, kind end

Return the actual number of tokens. Text and kind are stored as consecutive Array elements, so this is half of size.

[Source]

     # File lib/coderay/tokens.rb, line 181
     def count
       size / 2
     end

Dumps the object into a String that can be saved in files or databases.

The dump is created with Marshal.dump; in addition, it is gzipped using GZip.gzip.

The returned String object includes Undumping so it has an undump method. See Tokens.load.

You can configure the level of compression, but the default value 7 should be what you want in most cases as it is a good compromise between speed and compression rate.

See GZip module.

[Source]

     # File lib/coderay/tokens.rb, line 174
     def dump gzip_level = 7
       dump = Marshal.dump self
       dump = GZip.gzip dump, gzip_level
       dump.extend Undumping
     end
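
The dump and load steps can be sketched with the Ruby standard library alone. This example assumes Zlib::Deflate and Zlib::Inflate as stand-ins for CodeRay's GZip module, and uses a plain Array in place of a real Tokens object; it illustrates the round trip, not CodeRay's own implementation.

```ruby
require 'zlib'

# Hypothetical stand-in for a token list; any Marshal-able object works.
tokens = ['price', :ident, ' = ', :operator, '2.59', :float]

# Dump: marshal first, then compress (level 7 mirrors the default above).
packed = Zlib::Deflate.deflate(Marshal.dump(tokens), 7)

# Load: reverse the steps, decompress first, then unmarshal.
restored = Marshal.load(Zlib::Inflate.inflate(packed))
```

The order matters: loading has to undo the compression before Marshal.load can read the data.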

Encode the tokens using encoder.

encoder can be

  • a symbol like :html or :statistic
  • an Encoder class
  • an Encoder object

options are passed to the encoder.

[Source]

    # File lib/coderay/tokens.rb, line 66
    def encode encoder, options = {}
      encoder = Encoders[encoder].new options if encoder.respond_to? :to_sym
      encoder.encode_tokens self, options
    end

[Source]

     # File lib/coderay/tokens.rb, line 208
     def end_group kind; push :end_group, kind end

[Source]

     # File lib/coderay/tokens.rb, line 210
     def end_line kind; push :end_line, kind end

Redirects unknown methods to encoder calls.

For example, if you call tokens.html, the HTML encoder is used to highlight the tokens.

[Source]

    # File lib/coderay/tokens.rb, line 80
    def method_missing meth, options = {}
      encode meth, options
    rescue PluginHost::PluginNotFound
      super
    end
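
The redirect pattern can be tried out in isolation. In the sketch below, ENCODERS, UnknownEncoder, and TinyTokens are hypothetical names invented for illustration; CodeRay's real lookup goes through Encoders[] and raises PluginHost::PluginNotFound, as the listing above shows.

```ruby
# Hypothetical encoder registry standing in for CodeRay's Encoders lookup.
ENCODERS = {
  plain:  ->(items) { items.join },
  upcase: ->(items) { items.map(&:upcase).join }
}

# Hypothetical error class standing in for PluginHost::PluginNotFound.
class UnknownEncoder < StandardError; end

class TinyTokens < Array
  def encode(name)
    encoder = ENCODERS.fetch(name) { raise UnknownEncoder, name.to_s }
    encoder.call(self)
  end

  # Treat any unknown method name as an encoder name. If no such encoder
  # exists, fall back to the default behavior (NoMethodError).
  def method_missing(name, *args)
    encode(name)
  rescue UnknownEncoder
    super
  end

  def respond_to_missing?(name, include_private = false)
    ENCODERS.key?(name) || super
  end
end

list = TinyTokens['a', 'b']
plain_result  = list.plain
upcase_result = list.upcase
```

Calling a name that is not registered, such as list.bogus, still raises NoMethodError, because the rescue clause hands the call back to super.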

Split the tokens into parts of the given sizes.

The result is an Array of Tokens objects. Each part holds the amount of text given by the corresponding size parameter. In addition, each part closes all groups and lines that are still open, and the next part reopens them. This is useful for inserting tokens between the parts.

This method is used by Scanner#tokenize when called with an Array of source strings. The Diff encoder uses it for inline highlighting.

[Source]

     # File lib/coderay/tokens.rb, line 95
     def split_into_parts *sizes
       parts = []
       opened = []
       content = nil
       part = Tokens.new
       part_size = 0
       size = sizes.first
       i = 0
       for item in self
         case content
         when nil
           content = item
         when String
           if size && part_size + content.size > size  # token must be cut
             if part_size < size  # some part of the token goes into this part
               content = content.dup  # content may not be safe to change
               part << content.slice!(0, size - part_size) << item
             end
             # close all open groups and lines...
             closing = opened.reverse.flatten.map do |content_or_kind|
               case content_or_kind
               when :begin_group
                 :end_group
               when :begin_line
                 :end_line
               else
                 content_or_kind
               end
             end
             part.concat closing
             begin
               parts << part
               part = Tokens.new
               size = sizes[i += 1]
             end until size.nil? || size > 0
             # ...and open them again.
             part.concat opened.flatten
             part_size = 0
             redo unless content.empty?
           else
             part << content << item
             part_size += content.size
           end
           content = nil
         when Symbol
           case content
           when :begin_group, :begin_line
             opened << [content, item]
           when :end_group, :end_line
             opened.pop
           else
             raise ArgumentError, 'Unknown token action: %p, kind = %p' % [content, item]
           end
           part << content << item
           content = nil
         else
           raise ArgumentError, 'Token input junk: %p, kind = %p' % [content, item]
         end
       end
       parts << part
       parts << Tokens.new while parts.size < sizes.size
       parts
     end
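
To see the cutting behavior without the group bookkeeping, here is a deliberately simplified sketch. The helper split_text_tokens is hypothetical and not part of CodeRay; it assumes a flat [text, kind, text, kind, ...] list containing text tokens only, and it does not close or reopen groups and lines.

```ruby
# Simplified sketch: split a flat [text, kind, ...] list into parts of the
# given text sizes. Group and line actions are NOT handled here.
def split_text_tokens(flat, sizes)
  parts = [[]]
  remaining = sizes.dup
  room = remaining.shift  # text size still available in the current part
  flat.each_slice(2) do |text, kind|
    text = text.dup  # do not modify the caller's strings
    # Cut the token for as long as it does not fit into the current part.
    while room && text.size > room
      parts.last.push(text.slice!(0, room), kind) if room > 0
      parts << []
      room = remaining.shift
    end
    unless text.empty?
      parts.last.push(text, kind)
      room -= text.size if room
    end
  end
  # Pad with empty parts so there is one part per requested size.
  parts << [] while parts.size < sizes.size
  parts
end

flat  = ['price', :ident, ' = ', :operator, '2.59', :float]
parts = split_text_tokens(flat, [6, 6])
```

With sizes 6 and 6, the operator token ' = ' straddles the boundary, so it is cut into ' ' at the end of the first part and '= ' at the start of the second, each keeping the kind :operator.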

Turn tokens into a string by concatenating them.

[Source]

    # File lib/coderay/tokens.rb, line 72
    def to_s
      encode CodeRay::Encoders::Encoder.new
    end
