Module CurryStringClassifier

The Curry string classifier is a simple tool to process strings containing Curry source code. The source string is classified into the following categories:

  • moduleHead - module interface, imports, operators
  • code - the part where the actual program is defined
  • big comment - parts enclosed in {- ... -}
  • small comment - from "--" to the end of a line
  • text - a string, i.e. text enclosed in "..."
  • letter - the given string is the representation of a character
  • meta - containing information for meta programming

For an example of how to use the state scanner, see addtypes, the tool that adds function types to a given program.

Author: Bernd Brassel

Version: April 2005

Summary of exported operations:

isSmallComment :: Token -> Bool   
test for category "SmallComment"
isBigComment :: Token -> Bool   
test for category "BigComment"
isComment :: Token -> Bool   
test if given token is a comment (big or small)
isText :: Token -> Bool   
test for category "Text" (String)
isLetter :: Token -> Bool   
test for category "Letter" (Char)
isCode :: Token -> Bool   
test for category "Code"
isModuleHead :: Token -> Bool   
test for category "ModuleHead", i.e., imports and operator declarations
isMeta :: Token -> Bool   
test for category "Meta", i.e., between {+ and +}
scan :: String -> [Token]   
Divides the given string into the seven categories of Token.
plainCode :: [Token] -> String   
Yields the program code without comments (but with the line breaks for small comments).
unscan :: [Token] -> String   
Inverse function of scan, i.e., unscan (scan x) = x.
readScan :: String -> IO [Token]   
Returns the tokens for the file with the given name.
testScan :: String -> IO ()   
Tests whether (unscan . scan) is the identity.

Exported datatypes:


Token

The different categories to classify the source code.

Constructors:

  • SmallComment :: String -> Token
  • BigComment :: String -> Token
  • Text :: String -> Token
  • Letter :: String -> Token
  • Code :: String -> Token
  • ModuleHead :: String -> Token
  • Meta :: String -> Token
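
Since every constructor carries a single String, a token can be inspected by ordinary pattern matching. The following is a minimal sketch; the names tokenText and categoryName are hypothetical helpers, not part of the module, and whether the payload includes delimiters such as "--" or the quotes is not stated here.

```curry
import CurryStringClassifier

-- Hypothetical helper: the string payload carried by a token.
tokenText :: Token -> String
tokenText (SmallComment s) = s
tokenText (BigComment   s) = s
tokenText (Text         s) = s
tokenText (Letter       s) = s
tokenText (Code         s) = s
tokenText (ModuleHead   s) = s
tokenText (Meta         s) = s

-- Hypothetical helper: a printable name for the category of a token.
categoryName :: Token -> String
categoryName (SmallComment _) = "SmallComment"
categoryName (BigComment   _) = "BigComment"
categoryName (Text         _) = "Text"
categoryName (Letter       _) = "Letter"
categoryName (Code         _) = "Code"
categoryName (ModuleHead   _) = "ModuleHead"
categoryName (Meta         _) = "Meta"
```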

Tokens

Type synonym: Tokens = [Token]


Exported operations:

isSmallComment :: Token -> Bool   

test for category "SmallComment"

isBigComment :: Token -> Bool   

test for category "BigComment"

isComment :: Token -> Bool   

test if given token is a comment (big or small)

isText :: Token -> Bool   

test for category "Text" (String)

isLetter :: Token -> Bool   

test for category "Letter" (Char)

isCode :: Token -> Bool   

test for category "Code"

isModuleHead :: Token -> Bool   

test for category "ModuleHead", i.e., imports and operator declarations

isMeta :: Token -> Bool   

test for category "Meta", i.e., between {+ and +}

scan :: String -> [Token]   

Divides the given string into the seven categories listed above. For applications, it is important to know whether a given part of the code is at the beginning of a line or in the middle. The state scanner organizes the code in such a way that every string categorized as "Code" always starts in the middle of a line.
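
As an illustration, the result of scan can be filtered with the category tests. This is only a sketch; countComments and stringLiterals are hypothetical names introduced for the example.

```curry
import CurryStringClassifier

-- Hypothetical helper: number of comment tokens (big or small)
-- found in a string of Curry source code.
countComments :: String -> Int
countComments src = length (filter isComment (scan src))

-- Hypothetical helper: all tokens classified as "Text",
-- i.e., the string literals of the source.
stringLiterals :: String -> [Token]
stringLiterals src = filter isText (scan src)
```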

plainCode :: [Token] -> String   

Yields the program code without comments (but with the line breaks for small comments).
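
Combined with scan, plainCode gives a way to strip comments from a source string. A minimal sketch, where stripComments is a hypothetical name:

```curry
import CurryStringClassifier

-- Hypothetical helper: the program text of a source string
-- with its comments removed.
stripComments :: String -> String
stripComments src = plainCode (scan src)
```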

unscan :: [Token] -> String   

Inverse function of scan, i.e., unscan (scan x) = x. unscan is used to yield a program after changing the list of tokens.
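
A typical use is to scan a program, change the token list, and turn the result back into source text. The sketch below removes only the big comments and keeps everything else, including small comments; dropBigComments is a hypothetical name, not part of the module.

```curry
import CurryStringClassifier

-- Hypothetical helper: delete all {- ... -} comments from a source
-- string and leave the remaining tokens untouched.
dropBigComments :: String -> String
dropBigComments src = unscan (filter (not . isBigComment) (scan src))
```

For a source without big comments, the filter removes nothing, so the result equals the input by the property unscan (scan x) = x.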

readScan :: String -> IO [Token]   

Returns the tokens for the file with the given name.
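
A possible use inside an IO action is sketched below. The name reportComments is hypothetical, and the argument is assumed to be the name of a readable file.

```curry
import CurryStringClassifier

-- Hypothetical action: print how many comment tokens
-- the given source file contains.
reportComments :: String -> IO ()
reportComments file = do
  tokens <- readScan file
  putStrLn (file ++ ": " ++ show (length (filter isComment tokens))
                 ++ " comment token(s)")
```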

testScan :: String -> IO ()   

Tests whether (unscan . scan) is the identity.
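
The same property can be checked on a string that is already in memory. This is a sketch of the check only, not the implementation of testScan; roundTrips is a hypothetical name.

```curry
import CurryStringClassifier

-- Hypothetical helper: True if scanning and unscanning
-- reproduces the given source string unchanged.
roundTrips :: String -> Bool
roundTrips src = unscan (scan src) == src
```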