XML Converter, PAIGE to CC

Detailed Specifications of XML_converter_paige_to_cc

1. Overview

The XML_converter_paige_to_cc program converts XML files from the PAIGE format to the CC (Content Central) XML format and moves PDF files to a specified destination folder. It validates input files, logs events, and reads a CSV configuration file that maps document types.


2. Features and Functionalities

The program incorporates several key functionalities:

2.1 XML Conversion from PAIGE to CC

  • Reading and Validation: The program reads XML files and verifies that they adhere to the PAIGE format.
  • Conversion: Valid PAIGE XML files are transformed into the CC XML format.
  • Mapping Management: Document type mappings between PAIGE and CC are managed through a CSV configuration file.
  • Storage: Converted CC XML files are saved in the destination folder, replacing the original PAIGE XML files.

2.2 Handling PDF Files

  • Detection and Movement: The program identifies PDF files and moves them from the source folder to the destination folder.

2.3 Handling Non-XML and Non-PDF Files

  • Unsupported Files: Files that are neither XML nor PDF are considered unsupported. The program deletes these files to maintain the integrity of the processing environment.

2.4 Logging

  • Event Recording: When enabled, the program logs events such as errors, successful conversions, and file movements.
  • Log Storage: Logs are stored in a file named <executable_name>_log.txt.
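The naming rule for the log file can be sketched as follows. This is a minimal illustration, not the program's actual code: the helper name logFileNameFor is hypothetical, and stripping a file extension such as .exe from the executable name is an assumption. In a real program the path would typically come from os.Executable().

```go
package main

import (
	"path/filepath"
	"strings"
)

// logFileNameFor derives "<executable_name>_log.txt" from an executable path.
// Hypothetical helper: stripping the extension (for example ".exe") is an
// assumption, not documented behavior.
func logFileNameFor(exePath string) string {
	base := filepath.Base(exePath)                       // drop the directory part
	name := strings.TrimSuffix(base, filepath.Ext(base)) // drop any extension
	return name + "_log.txt"
}
```

For an executable at /usr/local/bin/converter, this sketch yields converter_log.txt.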

2.5 Configuration Management via CSV

  • Mapping Configuration: A CSV file is used to map PAIGE document types to CC catalog values and document types.
  • Fallback Mechanism: The first entry after the header in the CSV serves as a fallback if a document type is not found in the mapping.

2.6 Command-Line Arguments Handling

The program expects five command-line arguments:

./converter <filename> <xml_source_folder> <xml_destination_folder> <config_file> <enable_logging>

Where:

  • <filename>: Name of the file to process.
  • <xml_source_folder>: Directory containing the source files (XML and PDF).
  • <xml_destination_folder>: Directory where converted XML files should be saved.
  • <config_file>: CSV file containing the document type mappings.
  • <enable_logging>: TRUE to enable logging, FALSE to disable it.
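As an illustration of how the five arguments might be validated, here is a minimal Go sketch. The Args struct, the parseArgs helper, the case-insensitive matching of TRUE/FALSE, and the rejection of any other value are all assumptions made for the example; the actual program may behave differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Args holds the five positional arguments described above.
// The struct and its field names are illustrative assumptions.
type Args struct {
	Filename          string
	SourceFolder      string
	DestinationFolder string
	ConfigFile        string
	EnableLogging     bool
}

// parseArgs expects the arguments that follow the program name
// (for example os.Args[1:]).
func parseArgs(argv []string) (Args, error) {
	if len(argv) != 5 {
		return Args{}, fmt.Errorf("expected 5 arguments, got %d", len(argv))
	}
	var enable bool
	switch strings.ToUpper(argv[4]) { // case-insensitive match is an assumption
	case "TRUE":
		enable = true
	case "FALSE":
		enable = false
	default:
		return Args{}, fmt.Errorf("enable_logging must be TRUE or FALSE, got %q", argv[4])
	}
	return Args{
		Filename:          argv[0],
		SourceFolder:      argv[1],
		DestinationFolder: argv[2],
		ConfigFile:        argv[3],
		EnableLogging:     enable,
	}, nil
}
```

A main function would call parseArgs(os.Args[1:]) and exit with a usage message when it returns an error.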

3. Configuration File Specifications

3.1 Purpose

The CSV configuration file maps PAIGE document types to CC catalog values and CC document types, ensuring accurate conversion.

3.2 CSV File Format

The configuration file should be a standard CSV file with three columns:

PAIGE Document Type    CC Catalog Value    CC Document Type
Invoice                Financial           Accounts
Report                 Business            Analysis
Contract               Legal               Agreement

  • Column 1 (PAIGE Document Type): Document type in the PAIGE XML.
  • Column 2 (CC Catalog Value): Corresponding catalog value for the CC XML.
  • Column 3 (CC Document Type): Corresponding document type for the CC XML.

3.3 Example Configuration File (mapping.csv)

PAIGE Document Type,CC Catalog Value,CC Document Type
Invoice,Financial,Accounts
Report,Business,Analysis
Contract,Legal,Agreement
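
The mapping and its fallback rule (the first row after the header is used when a document type is not found) could be loaded along the following lines. This is a sketch under stated assumptions: the Mapping type and the parseMappings and lookup helpers are hypothetical names, the CSV is passed in as a string to keep the example self-contained, and exact, case-sensitive matching of document types is assumed.

```go
package main

import (
	"encoding/csv"
	"errors"
	"strings"
)

// Mapping holds the CC values for one PAIGE document type.
// The type and field names are illustrative assumptions.
type Mapping struct {
	Catalog string
	DocType string
}

// parseMappings reads the three-column CSV text, skips the header row, and
// returns the lookup table plus the fallback (the first data row).
func parseMappings(csvText string) (map[string]Mapping, Mapping, error) {
	reader := csv.NewReader(strings.NewReader(csvText))
	reader.FieldsPerRecord = 3 // reject rows that do not have three columns
	rows, err := reader.ReadAll()
	if err != nil {
		return nil, Mapping{}, err
	}
	if len(rows) < 2 {
		return nil, Mapping{}, errors.New("config needs a header and at least one mapping row")
	}
	table := make(map[string]Mapping, len(rows)-1)
	for _, row := range rows[1:] {
		table[row[0]] = Mapping{Catalog: row[1], DocType: row[2]}
	}
	fallback := Mapping{Catalog: rows[1][1], DocType: rows[1][2]}
	return table, fallback, nil
}

// lookup returns the mapping for paigeType, or the fallback when it is absent.
func lookup(table map[string]Mapping, fallback Mapping, paigeType string) Mapping {
	if m, ok := table[paigeType]; ok {
		return m
	}
	return fallback
}
```

With the example file above, a PAIGE document type that is not listed would resolve to the Invoice row's values (Financial, Accounts).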

3.4 Creating the Configuration File

To create the CSV file:

  1. Using a Text Editor:

    • Open a text editor (e.g., Notepad, VS Code).
    • Copy the format from the example and modify it as needed.
    • Save the file as mapping.csv in comma-separated values (CSV) format.
  2. Using a Spreadsheet Editor:

    • Open Excel or Google Sheets.
    • Enter the values in three columns as shown in the table above.
    • Save the file as CSV (Comma delimited) (*.csv).

4. Technical Breakdown

4.1 Data Structures

  • PaigeExport: Struct for parsing PAIGE XML files.
  • CCXML: Struct for generating CC XML files.
  • Document Type Mapping: A map (documentTypeMap) to store document type mappings from the CSV file.

4.2 Key Functions

  • logMessage(enableLogging bool, message string)
    Handles logging of messages to a log file if logging is enabled.

  • getLogFileName()
    Returns the log file's name based on the executable name.

  • isPaigeXML(paige PaigeExport) bool
    Validates whether the provided XML follows the PAIGE format.

  • convertPaigeToCC(paige PaigeExport) CCXML
    Converts a PAIGE XML struct to a CC XML struct.

  • loadConfig(configFile string) error
    Reads the CSV configuration file and populates the document type mapping.

  • processFile(filename, sourceFolder, destinationFolder, configFile string, enableLogging bool)
    Processes a file based on its type (PDF or XML):

    • Moves PDFs.
    • Converts valid PAIGE XML to CC XML.
    • Deletes invalid XML files and unsupported file types.
  • main()
    Handles command-line arguments, initializes logging, loads the configuration, and processes files.
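
The file-type dispatch performed by processFile can be pictured with the sketch below. The action type and classify helper are hypothetical, and treating extensions case-insensitively is an assumption. The sketch covers only the extension check; an XML file that later fails PAIGE validation is also deleted, which is not shown here.

```go
package main

import (
	"path/filepath"
	"strings"
)

// action names the three outcomes described for processFile.
// The type and constant names are illustrative assumptions.
type action int

const (
	actionConvertXML action = iota // validate as PAIGE, then convert to CC XML
	actionMovePDF                  // move from the source to the destination folder
	actionDelete                   // unsupported file type
)

// classify picks an action from the file extension alone.
// Case-insensitive matching is an assumption.
func classify(filename string) action {
	switch strings.ToLower(filepath.Ext(filename)) {
	case ".xml":
		return actionConvertXML
	case ".pdf":
		return actionMovePDF
	default:
		return actionDelete
	}
}
```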


5. Execution Instructions

5.1 Execution

Run the program with:

./converter filename.xml /source/path /destination/path config.csv TRUE

Example:

./converter sample.xml /home/user/source /home/user/destination mapping.csv TRUE

5.2 XML Destination Folder Configuration

The <xml_destination_folder> must be configured as an XML incoming folder in Content Central. This ensures that the converted CC XML files are properly ingested into the document management system for further processing. Refer to Ademero Support for instructions on configuring an XML capture job in Content Central.


6. Error Handling and Logging

  • Missing or Invalid Configuration File:

    • If the CSV mapping file is not found or is incorrectly formatted, the program exits with an error message.
  • Invalid XML Format:

    • If an XML file is not in PAIGE format, it is deleted to prevent incorrect data processing.
  • Unrecognized File Types:

    • Any file that is not .xml or .pdf is deleted.
  • Logging Behavior:

    • The log file is written synchronously to ensure immediate updates.
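
One way to obtain synchronous log writes in Go is to append each line and flush it to disk before returning, as in this sketch. The appendLogLine helper, its extra logPath parameter, the RFC 3339 timestamp format, and the 0644 file mode are assumptions for illustration and do not describe the program's actual logMessage implementation.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// appendLogLine appends one timestamped line to logPath and flushes it to
// disk before returning. Hypothetical helper: the name, timestamp format,
// and file mode are assumptions.
func appendLogLine(enableLogging bool, logPath, message string) error {
	if !enableLogging {
		return nil // logging disabled: nothing is written
	}
	f, err := os.OpenFile(logPath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	if _, err := fmt.Fprintf(f, "%s %s\n", time.Now().Format(time.RFC3339), message); err != nil {
		return err
	}
	return f.Sync() // force the write to stable storage
}
```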

7. Strengths and Improvements

Strengths

✔ Robust Validation: Ensures PAIGE XML follows a correct format before processing.
✔ Modular Design: Functions are well-structured for easy maintenance.
✔ Error Handling: Uses structured error handling for missing files, incorrect mappings, etc.
✔ Logging Support: Provides detailed logs for tracking operations.
✔ CSV-Based Configuration: Allows dynamic mapping between document types.

Potential Improvements

🔹 Multi-File Processing: Currently, the program processes one file at a time. It could be enhanced to process multiple files in a directory.
🔹 Logging Optimization: Logs are written synchronously, which could slow down execution. Consider using buffered logging.
🔹 Parallel Processing: Using goroutines to process multiple files concurrently would improve performance.
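
The multi-file and parallel processing improvements might look like the following worker-pool sketch. It is not part of the current program: processAll is a hypothetical helper, and the process callback stands in for a per-file routine such as processFile.

```go
package main

import "sync"

// processAll runs process on every filename using at most `workers`
// goroutines and returns one error slot per input, in input order.
// Hypothetical helper sketching the suggested improvement.
func processAll(filenames []string, workers int, process func(string) error) []error {
	if workers < 1 {
		workers = 1
	}
	errs := make([]error, len(filenames))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one goroutine,
				// so writes to errs never overlap.
				errs[i] = process(filenames[i])
			}
		}()
	}
	for i := range filenames {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return errs
}
```

If several workers write to the same log file, the log writes would need to be serialized, for example with a mutex.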


8. Conclusion

The XML_converter_paige_to_cc program is well-structured and effectively converts PAIGE XML to CC XML while handling PDFs. It includes a strong validation mechanism, logging, and configuration support. Proper configuration of the XML destination folder ensures seamless integration with Content Central.
