Tutorial: Get Started Writing PEML

The Programming Exercise Markup Language (PEML) is designed to provide an ultra-human-friendly authoring format for describing automatically graded programming assignments.

Core Syntax

You probably already know a little about what PEML is and why it was created. Some of the very rudimentary basics of PEML's syntax are shown here:

# Single-line comments start with #
# Comments must be on lines by themselves

key: value

# keys start on the beginning of a line, and are followed by a colon.
# values follow the key (can start on the next line), and continue to
# the next key.

nested.keys.define.structure: Nested keys use dotted notation

quoted.value:----------
Quoted values act like "HereDocs". You can define your own
delimiter (any printing character repeated 3+ times).
----------

Your First PEML File

OK, enough with the syntax. Now start with this minimal version of an exercise and fill in your own content (you can download a slightly longer version here):

exercise_id: <insert-your-globally-unique-id-here>

title: <what title will you use?>

license.id: cc-by-sa-4.0
license.owner.email: <your email>
license.owner.name: <your name>

instructions:----------
Write instructions for your exercise here.
----------

You can edit your own PEML files locally using any text editor, or you can edit PEML right in your web browser by opening our live editing/validation site in a separate tab:

PEML Live!

The fields in this minimal document are:

exercise_id

Provide a globally unique identifier for your exercise. Any sequence of non-whitespace characters is OK, but you may wish to stick to existing naming conventions used in other domains. For example, you could use a unique URL (perhaps where this exercise's home definition lives), something like a Java package name (a dotted name, perhaps including a university's domain name as a prefix), or your email combined with a unique exercise suffix of your own devising.
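
For example, any one of these hypothetical identifiers would work (the names and domains here are made up):

# A URL where the exercise's home definition lives
exercise_id: https://cs.example.edu/exercises/palindrome-check

# Or a Java-package-style dotted name
exercise_id: edu.example.cs.exercises.palindrome-check

# Or an email plus a unique suffix of your own devising
exercise_id: jane@example.edu/palindrome-check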

title

Provide a descriptive title that will be used as a human-readable label for the exercise. The intent is for this to be the "title" shown to students in various contexts, either when viewing a single exercise or when viewing lists of exercises. While there is no specific length limit, ideally titles should be no more than "one line" in size, because of the various contexts where they might be displayed.

license.id

While a license isn't strictly required, it is strongly recommended. The id can be specified by a URL that identifies the license, or by a name (or abbreviated name) in common use, such as any of the license keywords used by GitHub (an excellent source for potential license choices). You can even use "(C) 2021 <your name>, All rights reserved". While you could just specify an author email using author:, listing a license is better (everyone must assume "all rights reserved" if you do not). See author or license for more details.
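
For example, either of these hypothetical values would work:

# A GitHub-style license keyword
license.id: mit

# Or a URL identifying the license
license.id: https://creativecommons.org/licenses/by-sa/4.0/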

license.owner

Probably you. For an individual, either specify a unique, identifying email address or, as shown here, an email address along with a name using separate keys. For an organization, you can specify your own information as author:, and then provide license.owner: to identify the organization owning the copyright.
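
For example, a hypothetical organizational setup might look like this (assuming author: takes the same name/email subkeys as license.owner:, which you should confirm against the PEML definition):

# The individual who wrote the exercise
author.name: Jane Smith
author.email: jane@example.edu

# The organization that owns the copyright
license.owner.name: Example University
license.owner.email: oer@example.edu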

instructions

Provide your exercise's instructions here. This isn't required in all contexts (for example, if providing an auto-grading setup), but you probably want to.

Now you have a PEML description!

Identifying the Programming Language

If your exercise is a programming exercise, you probably will find it useful to identify the programming language it supports. PEML does this by allowing you to define the "programming system".

[systems]
language: Java
version: >= 1.9
[]

The language: key is the required one--specify the language that is supported using its common name (be careful of capitalization, since some tools processing descriptions might not treat the name case-insensitively), or its MIME type (to reduce ambiguity). Optionally, you can also specify a version, if your supporting files are version-dependent. Feel free to use gem-style version dependency constraints, although check with your educational tool to determine what is supported.
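
For example, here is a hypothetical entry using a gem-style "pessimistic" constraint (confirm which constraint operators your tool accepts):

[systems]
language: Java
# any 17.x release at or above 17.0
version: ~> 17.0
[]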

Note that the language: and version: keys here are listed inside a [systems] array. There's only one element in the array in this example, but PEML does allow authors to express exercises that support multiple programming systems. You can ignore that for now. However, this is a good chance to recap PEML's array notation.

In short, arrays (lists) in PEML have keys that are surrounded by square brackets instead of being followed by a colon. The end of the list is marked with a pair of empty square brackets (which can be omitted at the end of the file). All the keys between these two markers are part of the list. As in ArchieML, as soon as a key inside the list repeats, that repetition is taken as the start of a new entry in the list. So an array with multiple entries might look like this:

[systems]
language: Java
version: >= 1.9

language: Python
version: >= 3
[]

Associating Supporting Files

OK, so where's all the cool stuff? Like auto-grader inputs and all that?

PEML provides a very rich model for structuring this information for tool use. However, PEML relies heavily on convention over configuration to simplify the way those things are managed and to make it easier for authors to learn the minimal amount they need, and gradually add onto that core over time as more advanced situations arise.

Starting Files Provided to the Student

Suppose you want to provide some file(s) to the student as the starting point for their solution. Just add them in the src/starter folder next to the PEML description itself. We recommend placing each exercise that uses additional resources in its own directory. This approach works whether the PEML description is located in a folder on the local machine, is packaged in a zip file or another form of archive, or is hosted in a repository. You could also use the [src.starter.files] array key to provide this information in the PEML description itself, but implicitly providing the files by co-location is often simpler.

# Providing starter files for the user
dir
|-- exercise.peml
+-- src
    +-- starter
        |-- file1.ext
        |-- file2.ext
        +-- file3.ext
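
If you prefer the [src.starter.files] alternative mentioned above, a hypothetical sketch might look like the following, reusing the name/content pattern shown in the test suite examples later in this tutorial (check your tool for the exact fields it expects):

[src.starter.files]
name: Answer.java
content:----------
public class Answer {
    // TODO: write your solution here
}
----------
[]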

Images Used in the Instructions

Suppose you want to provide images for use in your instructions. Instructions are typically written in Markdown or vanilla HTML, but can certainly refer to supplemental files, whether they be images, separate pages describing APIs, examples students can download, etc. While you could host these resources on your own website and use absolute URLs, you may wish for them to be packaged with the exercise. You can use the [public_html] key to specify these explicitly, or simply place them in a public_html folder.

# Providing images files for the instructions
dir
|-- exercise.peml
+-- public_html
|   |-- image1.png
|   |-- image2.png
|   +-- download_file.dat
+-- src
    +-- starter
        |-- file1.ext
        |-- file2.ext
        +-- file3.ext
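
Your instructions can then refer to these resources using relative URLs. Here is a hypothetical Markdown sketch (exactly how relative URLs resolve depends on the tool serving the exercise, so check its documentation):

instructions:----------
Count the words in the data file, as shown in this diagram:

![word-count diagram](public_html/image1.png)

You can also [download the sample data](public_html/download_file.dat).
----------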

Test Case Files for Auto-grading

Suppose you want to provide some file(s) specifying the tests you want to use to check the behavior of answers to the exercise. These could be in the form of compilable program code, scripts, data files, or whatever notation/format is used by the auto-grading tool reading your PEML description. You can provide these as separate files under the suites folder, which corresponds to the [suites] array key.

# Providing test case files
dir
|-- exercise.peml
+-- public_html
|   |-- image1.png
|   |-- image2.png
|   +-- download_file.dat
+-- src
|   +-- starter
|       |-- file1.ext
|       |-- file2.ext
|       +-- file3.ext
+-- suites
    |-- TestClass1.java
    +-- TestClass2.java

Other Supplemental Files for Auto-grading

Suppose your auto-grading tests use additional data files or other resources that need to be available during execution of your tests. Not all grading tools support such resources, but if they do, you can provide them in the environment/test folder, which corresponds to the [environment.test.files] key.

Note: If you'd rather use a docker image to provide the setup for your environments (for building, running, testing, or even the student's starting environment) and your tool supports this kind of usage, you probably want to specify image information directly in the PEML file--see Environments in the data model.

# Providing data resources for use in testing
dir
|-- exercise.peml
+-- public_html
|   |-- image1.png
|   |-- image2.png
|   +-- download_file.dat
+-- src
|   +-- starter
|       |-- file1.ext
|       |-- file2.ext
|       +-- file3.ext
+-- suites
|   |-- TestClass1.java
|   +-- TestClass2.java
+-- environment
    +-- test
        |-- file4.ext
        |-- file5.ext
        +-- file6.ext

System-specific Files

Top-level keys like src.*, environment.*, or [suites] affect the whole exercise, which means they apply to all programming systems that the exercise supports. When the exercise only targets one programming language, that's probably fine. If, however, you write your exercise so that it supports multiple programming systems, you may wish to provide different resources for each system. You can do that by using the same directory structure for src/, environment/, and suites/, but placing them underneath systems/<language> (using the language name as specified in the PEML file--if a MIME type is used, replace the '/' in the MIME type with a '-').
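
For example, here is a hypothetical sketch of how a MIME-type language name maps onto the directory structure:

# Hypothetical: the PEML file declares "language: text/x-python",
# so the '/' becomes a '-' in the directory name
dir
|-- exercise.peml
+-- systems
    +-- text-x-python
        +-- src
            +-- starter
                +-- starter.py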

Note: Most tools do not support exercises that target multiple programming systems. However, they should still support the first system listed, including system-specific settings provided in the way described here.

# Providing system-specific resources
dir
|-- exercise.peml
+-- systems
    +-- Java
    |   +-- src
    |   |   +-- starter
    |   |       |-- Class1.java
    |   |       +-- Class2.java
    |   +-- suites
    |   |   |-- TestClass1.java
    |   |   +-- TestClass2.java
    |   +-- environment
    |       +-- test
    |           |-- file3.ext
    |           +-- file4.ext
    +-- python
        +-- src
        |   +-- starter
        |       |-- class1.py
        |       +-- class2.py
        +-- suites
        |   |-- test_class1.py
        |   +-- test_class2.py
        +-- environment
            +-- test
                |-- file3.ext
                +-- file4.ext

If files are specified at both the global and system-specific levels, the files available are the union of both; when a file with the same path name appears in both locations, the system-specific contents override the global version.
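
For example, in this hypothetical layout, a Java run sees the systems/Java copy of Util.ext plus the global Helper.ext:

# Hypothetical: system-specific override of one global starter file
dir
|-- exercise.peml
+-- src
|   +-- starter
|       |-- Util.ext
|       +-- Helper.ext
+-- systems
    +-- Java
        +-- src
            +-- starter
                +-- Util.ext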

Using Data-driven Test Suites for Simple Cases

For exercises that have relatively simple scaffolding requirements for testing--that is, most or all of the tests follow the same basic format, but vary in some standardized ways such as different inputs or outputs--you may find writing test data directly into your PEML description to be more convenient than providing separate test cases.

For example, maybe you are working with a tool that only tests standard input/standard output behaviors, so every test you run consists only of a pair of input/output values. Then you might be able to describe your tests this way:

[suites]
[.cases]

stdin: racecar
stdout: "racecar" is a palindrome.

stdin: Flintstone
stdout: "Flintstone" is not a palindrome.

[]
[]
# The [] ends/closes a list of items. The first []
# closes the list of cases in one suite, while the second [] closes
# the list of suites, which here includes only one suite.

Here, the [suites] array consists of a single test suite that contains two test cases. Each test case contains two values, named "stdin" and "stdout". The tool translating this input into test cases would need to recognize those names and know how to use them, so check with your tool regarding support for this format and the required naming conventions for variables. However, this is a fairly simple way to mark up the values when the situation permits.

In fact, the same content could be written in CSV, YAML, JSON, etc., depending on what your grading tool supports.

[suites]

name: csv-version.csv
type: text/csv
content:----------
stdin,stdout
"racecar","""racecar"" is a palindrome."
"Flintstone","""Flintstone"" is not a palindrome."
----------

name: yaml-version.yaml
type: text/x-yaml
content:----------
---
- stdin: racecar
  stdout: "\"racecar\" is a palindrome."
- stdin: Flintstone
  stdout: "\"Flintstone\" is not a palindrome."
----------

[]

Here, the MIME types for the files were specified explicitly, but they can also be deduced from the file names, so the type: key is redundant (not required). Also, because of the error-prone nature of manually quoting data in CSV format, PEML supports an alternative "text/x-unquoted-csv" type where values are written in the same notation that would be used in the target programming system, allowing more natural use of expressions, native literal constructs, escapes, etc.
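
For example, here is a hypothetical sketch of the palindrome tests in the unquoted variant, assuming a Java-based system, so each value is written as a Java string literal rather than a quoted CSV field:

[suites]
name: unquoted-version.csv
type: text/x-unquoted-csv
content:----------
stdin,stdout
"racecar","\"racecar\" is a palindrome."
"Flintstone","\"Flintstone\" is not a palindrome."
----------
[]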

If you are lucky enough to have a tool that supports it, you may be able to provide a template used to translate a single test case record into executable code. You might even be able to designate tests as privately visible to instructors only, or publicly visible to students. For example:

[suites]
name: csv-version.csv
visibility: public
template:
    assertEquals("{{x}} * {{y}} == {{expected}}",
        {{expected}}, Answer.multiply({{x}}, {{y}}));
content:----------
x,y,expected
2,4,8
1,1,1
-2,3,-6
----------
[]
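
Assuming a mustache-style substitution like the one sketched above (check your tool's documentation for its actual template syntax), the first data row would expand into something like this Java test statement:

// Hypothetical expansion of the row x=2, y=4, expected=8
assertEquals("2 * 4 == 8", 8, Answer.multiply(2, 4));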

What Next?

If you haven't read through the whole PEML Introduction, that is your next step.

If you want to know more about how PEML came to be, why we're not using straight YAML or JSON, PEML's design goals, and its influences, read our About PEML page.

Finally, start digging into the PEML definition.