Module CurryStringClassifier

The Curry string classifier is a simple tool to process strings containing Curry source code. The source string is classified into the following categories:

  • moduleHead - module interface, imports, operators
  • code - the part where the actual program is defined
  • big comment - parts enclosed in {- ... -}
  • small comment - from "--" to the end of a line
  • text - a string, i.e. text enclosed in "..."
  • letter - the given string is the representation of a character
  • meta - containing information for meta programming

For an example of how to use the state scanner, see addtypes, the tool that adds function types to a given program.

Author: Bernd Brassel

Version: November 2020

Summary of exported operations:

isSmallComment :: Token -> Bool  Deterministic 
test for category "SmallComment"
isBigComment :: Token -> Bool  Deterministic 
test for category "BigComment"
isComment :: Token -> Bool  Deterministic 
test if given token is a comment (big or small)
isText :: Token -> Bool  Deterministic 
test for category "Text" (String)
isLetter :: Token -> Bool  Deterministic 
test for category "Letter" (Char)
isCode :: Token -> Bool  Deterministic 
test for category "Code"
isModuleHead :: Token -> Bool  Deterministic 
test for category "ModuleHead", i.e., imports and operator declarations
isMeta :: Token -> Bool  Deterministic 
test for category "Meta", i.e., between {+ and +}
scan :: String -> [Token]  Non-deterministic 
Divides the given string into the seven categories.
plainCode :: [Token] -> String  Deterministic 
Yields the program code without comments (but with the line breaks for small comments).
unscan :: [Token] -> String  Deterministic 
Inverse function of scan, i.e., unscan (scan x) = x.
readScan :: String -> IO [Token]  Non-deterministic 
return tokens for given filename
testScan :: String -> IO ()  Non-deterministic 
test whether (unscan . scan) is identity

Exported datatypes:


Token

The different categories to classify the source code.

Constructors:

  • SmallComment :: String -> Token
  • BigComment :: String -> Token
  • Text :: String -> Token
  • Letter :: String -> Token
  • Code :: String -> Token
  • ModuleHead :: String -> Token
  • Meta :: String -> Token
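
Since every constructor carries a single String, a token can be inspected by ordinary pattern matching. The following is a minimal sketch; the helper name categoryName is hypothetical and not part of the module, and the returned labels simply mirror the category names used in the module description.

```curry
import CurryStringClassifier

-- Hypothetical helper (not exported by the library):
-- maps a token to the name of its category.
categoryName :: Token -> String
categoryName (SmallComment _) = "small comment"
categoryName (BigComment   _) = "big comment"
categoryName (Text         _) = "text"
categoryName (Letter       _) = "letter"
categoryName (Code         _) = "code"
categoryName (ModuleHead   _) = "moduleHead"
categoryName (Meta         _) = "meta"
```

For a plain yes/no test on a single category, the exported predicates such as isCode or isComment are usually more convenient than a hand-written match.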

Tokens

Type synonym: Tokens = [Token]


Exported operations:

isSmallComment :: Token -> Bool  Deterministic 

test for category "SmallComment"

isBigComment :: Token -> Bool  Deterministic 

test for category "BigComment"

isComment :: Token -> Bool  Deterministic 

test if given token is a comment (big or small)

isText :: Token -> Bool  Deterministic 

test for category "Text" (String)

isLetter :: Token -> Bool  Deterministic 

test for category "Letter" (Char)

isCode :: Token -> Bool  Deterministic 

test for category "Code"

isModuleHead :: Token -> Bool  Deterministic 

test for category "ModuleHead", i.e., imports and operator declarations

isMeta :: Token -> Bool  Deterministic 

test for category "Meta", i.e., between {+ and +}

scan :: String -> [Token]  Non-deterministic 

Divides the given string into the seven categories. For applications it is important to know whether a given part of code is at the beginning of a line or in the middle. The state scanner organizes the code in such a way that every string categorized as "Code" always starts in the middle of a line.
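
A typical use of scan is to classify a source string held in memory and then query the result with the exported predicates. The sketch below assumes only the operations documented here; the helper names countComments and hasMeta are hypothetical.

```curry
import CurryStringClassifier

-- Hypothetical helper: number of comment tokens (big or small)
-- found in a string of Curry source code.
countComments :: String -> Int
countComments src = length (filter isComment (scan src))

-- Hypothetical helper: does the source contain any meta information?
hasMeta :: String -> Bool
hasMeta src = any isMeta (scan src)
```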

plainCode :: [Token] -> String  Deterministic 

Yields the program code without comments (but with the line breaks for small comments).
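
Combined with scan or readScan, plainCode gives a simple comment stripper. This is a sketch under the assumption that the input is valid Curry source; the helper names stripComments and stripCommentsInFile, and the idea of writing the result to a second file, are illustrative only.

```curry
import CurryStringClassifier

-- Hypothetical helper: remove all comments from a source string.
stripComments :: String -> String
stripComments src = plainCode (scan src)

-- Hypothetical helper: write a comment-free copy of inFile to outFile.
stripCommentsInFile :: String -> String -> IO ()
stripCommentsInFile inFile outFile = do
  tokens <- readScan inFile
  writeFile outFile (plainCode tokens)
```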

unscan :: [Token] -> String  Deterministic 

Inverse function of scan, i.e., unscan (scan x) = x. unscan is used to yield a program after changing the list of tokens.
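
The scan, modify, unscan workflow can be sketched as follows. The example deletes block comments and leaves all other tokens untouched; the helper name dropBigComments is hypothetical. How the remaining text is laid out around a removed comment depends on the library's rendering and is not specified here.

```curry
import CurryStringClassifier

-- Hypothetical helper: delete all {- ... -} comments from a source
-- string while keeping every other token unchanged.
dropBigComments :: String -> String
dropBigComments src =
  unscan (filter (not . isBigComment) (scan src))
```

In contrast to plainCode, this keeps small comments, because only tokens of category "BigComment" are filtered out before unscan is applied.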

readScan :: String -> IO [Token]  Non-deterministic 

return tokens for given filename

testScan :: String -> IO ()  Non-deterministic 

test whether (unscan . scan) is identity
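
The two file-based operations might be used together as in the sketch below. The file name "Example.curry" and the helper roundTrips are assumptions for illustration; roundTrips states the same identity as testScan, but for a string held in memory.

```curry
import CurryStringClassifier

-- Hypothetical helper: the round-trip property for an in-memory string.
roundTrips :: String -> Bool
roundTrips src = unscan (scan src) == src

main :: IO ()
main = do
  -- "Example.curry" is a placeholder file name.
  testScan "Example.curry"
  tokens <- readScan "Example.curry"
  putStrLn (show (length (filter isComment tokens)) ++ " comment tokens")
```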