DLP/Samples

We are always looking for sample files in various file formats. We need them for the regression test suites of the existing libraries, but also for introspecting unknown formats (e.g., those on the "suggested" list, but not limited to it). We are not committed to writing a filter for any particular format, but having some files (or, even better, a person who can create more on request) definitely improves the chances of that happening :-)

If you are interested and want to help, read on to see which documents we are looking for and how to create samples for regression testing and for introspection.

Creating sample documents for regression testing
These documents will be publicly available in a regression test repository, so we need an acknowledgement that they are available under the CC-BY-SA license. Files of unknown origin (e.g., found on the Internet) are not acceptable. Please try to follow these suggestions when creating sample sets:


 * Create small documents, each covering one isolated feature (or sub-feature) of the format, e.g., paragraph formatting, character formatting, headers/footers, shape transformations, etc.
 * Create documents for a _single_ version of the application. Always specify the version and operating system (Windows, macOS, Linux, OS/2, DOS etc.) when you are contributing the files to us.
 * Use at least somewhat meaningful file names, e.g., sample1.xyz is bad, ellipse.xyz or footnote.xyz is good.
 * Samples do not have to be created from scratch: it is possible to take an existing set (provided there is one :-) and save the files using a newer (or older) version of the file format. Just please tell us that you have done that.

What features should be covered
The following list is a rough guide to the features that should be covered by sample documents (provided that the application supports them). Ideally, every bullet point should be covered by a single document or a small number of documents.


 * all predefined shapes (or a representative sample, if there are too many of them)
 * shape transformations, e.g., for a simple shape and a complex shape (Bézier curves or NURBS)
 * shape fills
 * line strokes
 * various arrows for line endings
 * use of layers
 * text in different languages (it does not have to be grammatically or stylistically correct--automatic translation is okay)
 * text in scripts using different writing directions (Chinese, Arabic, Hebrew, ...)
 * tables (including use of merged cells and non-default borders)
 * grouped shapes
 * more text properties, e.g., a different font, font size, subscript, superscript, color
 * paragraph properties, e.g., justification, first line indentation, line height
 * paragraph rules/borders
 * tabs
 * styles
 * bitmap images, ideally using various different input formats (e.g., JPEG, PNG, BMP)
 * embedded fonts
 * document metadata, e.g., title, author, description, keywords
 * use of master pages (or page masters, page styles, page templates, or whatever else it is called)
 * shape connectors
 * headers and footers
 * footnotes
 * embedded formulas
 * different types of image anchoring (to page, to paragraph, etc.)
 * different page sizes in a single document
 * multi-column text
 * fields (page count, page number etc.)

Wanted samples

 * Apple Keynote 5.0-5.2.
   * Just a single file is needed, to ensure that the version string is detected correctly.
 * CorelDRAW - all versions.
 * CorelDRAW Exchange (CMX).
   * Both Windows and Mac files.
 * Microsoft Visio - all versions.

Creating sample documents for introspecting unknown formats
These documents would only be used for our internal needs, so any file is fine, regardless of its source. That means that files randomly collected on the Internet are fine--after all, some samples are better than no samples. If you are creating a sample document yourself, please try to follow the general suggestions described above, with the following additions:


 * Create very minimal documents (e.g., a single text line or a single shape).
 * Create several variations of the same base document, e.g., a slightly moved or resized shape, a couple of characters added to a paragraph, etc.
 * Use at least somewhat meaningful file names, e.g., sample1.xyz is bad, ellipse.xyz or footnote.xyz is good. Variations of a base document can be numbered.
 * Export each document to PDF or make a screenshot of it opened in the original application, if possible.
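One plausible use of such variations is a byte-level comparison: the offsets that change between two saves hint at where a property is stored. The following is a minimal sketch of that idea, not a tool the project provides. The names `diff_ranges` and `print_diff` are made up for illustration, and the sketch assumes the format stores its data uncompressed, so that a small edit changes only a few bytes.

```python
from pathlib import Path


def diff_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (start, end) offset pairs, end exclusive, where a and b differ.

    Bytes are compared position by position over the common length;
    a difference in length is reported as one extra range at the end.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges


def print_diff(path_a, path_b):
    """Print the differing offset ranges of two files in hexadecimal."""
    first = Path(path_a).read_bytes()
    second = Path(path_b).read_bytes()
    for start, end in diff_ranges(first, second):
        print(f"0x{start:08x}-0x{end:08x} ({end - start} bytes)")
```

Calling `print_diff("ellipse1.xyz", "ellipse2.xyz")` on two numbered variations (hypothetical file names) lists the regions worth looking at first. If nearly every byte differs, the format is probably compressed or otherwise encoded, and a plain comparison like this one will not help much.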

Contributing sample documents
There are two ways to get your sample documents to us. The first is via e-mail or some internet storage service. The second is via Gerrit; this is only available in certain cases. Both are described in more detail in the following text. Please never attach packs of sample documents to Bugzilla: it has a size limit for attachments.

Via e-mail
Pack all your sample files into a single archive (e.g., zip), upload it to some internet storage service and send info about it to [mailto:info@documentliberation.org info@documentliberation.org]. The e-mail should have the subject "sample documents for " and you should include info about the application version, the operating system (macOS, Windows, Linux, OS/2, DOS etc.) and the origin of the documents (created by you or taken from other sources). For documents created by yourself, we assume that you agree to provide them under the CC-BY-SA 4.0 license.

Alternatively, you can send your files (preferably in a single archive) attached to an e-mail to [mailto:dtardon@redhat.com dtardon@redhat.com] or [mailto:fridrich.strba@libreoffice.org fridrich.strba@libreoffice.org]. The subject and content of the e-mail should be the same as explained in the previous paragraph.
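Packing the samples together with the requested information can be scripted. The sketch below is only one way to do it: the function name `pack_samples`, the `INFO.txt` file name and its layout are assumptions made for illustration, not something the project requires. It uses only the Python standard library.

```python
import zipfile
from pathlib import Path


def pack_samples(sample_dir, archive_path, application, version,
                 operating_system, origin):
    """Zip every file under sample_dir and add an INFO.txt describing them.

    Returns the archive member names of the packed sample files.
    """
    sample_dir = Path(sample_dir)
    # The details the e-mail is asked to contain, kept next to the files.
    info = (
        f"Application: {application}\n"
        f"Version: {version}\n"
        f"Operating system: {operating_system}\n"
        f"Origin: {origin}\n"
    )
    # Collect the files before the archive is created, so the archive
    # itself is not picked up if it is written into sample_dir.
    files = sorted(p for p in sample_dir.rglob("*") if p.is_file())
    names = [p.relative_to(sample_dir).as_posix() for p in files]
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("INFO.txt", info)
        for path, name in zip(files, names):
            archive.write(path, name)
    return names
```

A call such as `pack_samples("samples", "samples.zip", "ExampleApp", "1.0", "Linux", "created by me")` (all values hypothetical) produces one archive that can be uploaded or attached. The same details should still be written in the e-mail itself.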

Via Gerrit
Several of our import libraries and their test suites are hosted on http://gerrit.libreoffice.org. That means that the way to contribute to them is the same as for, e.g., LibreOffice. The needed info about our Gerrit setup is available here. This is not so convenient for one-time contributions, as the initial setup takes some effort, but it makes further contributions much easier.

The test repositories that are available in that way are libabw-test, libcdr-test, libetonyek-test, libfreehand-test, libmspub-test, libpagemaker-test and libvisio-test. You need to clone the repository that should contain test files of the format you prepared. Which one it is should hopefully be clear from the repository names; if it is not, this page contains an overview of all the libraries and the formats they can handle.

For adding your files, you can use the simple method:


 * add your sample files to a new directory created in the cloned repository;
 * commit, putting info about the files (the same as described above in the section about e-mail) into the commit message;
 * send for review.
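The three steps of the simple method map onto ordinary git commands. The sketch below only builds and, optionally, runs them. It assumes a remote named `origin`, a target branch called `master`, and the generic Gerrit convention of pushing to `refs/for/<branch>`; the actual project setup may differ, so check the Gerrit setup info first. The function names are made up for illustration.

```python
import subprocess


def commit_message(file_format, application, version, operating_system, origin):
    """Build a commit message carrying the same info as the e-mail would."""
    return (
        f"add sample documents for {file_format}\n\n"
        f"Application: {application} {version}\n"
        f"Operating system: {operating_system}\n"
        f"Origin: {origin}\n"
    )


def simple_method_commands(sample_dir, message, branch="master"):
    """Return the git commands for: add files, commit, send for review."""
    return [
        ["git", "add", sample_dir],
        ["git", "commit", "-m", message],
        # Generic Gerrit convention; remote and branch names are assumptions.
        ["git", "push", "origin", f"HEAD:refs/for/{branch}"],
    ]


def run(commands, repo, dry_run=True):
    """Print each command and execute it in repo unless dry_run is set."""
    for command in commands:
        print("$", " ".join(command))
        if not dry_run:
            subprocess.run(command, cwd=repo, check=True)
```

With `dry_run=True` (the default) nothing is executed, so the commands can be reviewed before anything is committed or pushed.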

Or the slightly more complicated method (but only if you know what you are doing):


 * add your sample files to a specific subdirectory of directory ;
 * add your name and e-mail address to, if it is not already there;
 * run  (optional, you have to have a fresh master build of writerperfect and the library the test files are for);
 * commit, putting info about the files (the same as described above in the section about e-mail) into the commit message;
 * send for review.