add csv and xlsx formats #61

Merged
merged 43 commits into from
Apr 30, 2024
79d29a2
add csv format
Apr 2, 2024
69c786d
improve csv output stream
Apr 3, 2024
f7366ab
simplify conditions
Apr 3, 2024
9d29dd9
import csv WIP, an iceberg mess
Apr 3, 2024
320d4e0
wip
Apr 4, 2024
5fbdf54
move canImport to import.js and canImportOrExport to index.js
Apr 5, 2024
04a5343
export: separate concerns between apos related stuff and formats
Apr 5, 2024
8c35515
comment
Apr 5, 2024
0f015b3
import: separate concerns between apos related stuff and formats
Apr 5, 2024
9a92776
import: fix remove error
Apr 9, 2024
198950d
add logs and clean existing ones
Apr 9, 2024
b90b8bd
separation of concerns: stop filtering docs in format script
Apr 9, 2024
9414687
add xslx format
Apr 9, 2024
6521d2d
fix select line-height
Apr 9, 2024
b59c99f
better use of options and default values
Apr 10, 2024
a8965cf
clean csv options
Apr 10, 2024
4a7247e
csv and xlsx parsing - wip
Apr 10, 2024
cb952c4
clean
Apr 10, 2024
24704ef
thanks to cypress, fix a bug when overriding locales and clarify code
Apr 10, 2024
86df216
refactor remove functions
Apr 11, 2024
d7e4882
hmmm
Apr 11, 2024
d348255
remove instantly from uploadfs (to discuss)
Apr 16, 2024
778fa4f
Revert "remove instantly from uploadfs (to discuss)"
Apr 16, 2024
d14de39
add doc to add new formats
Apr 16, 2024
132616d
revert to input and output
Apr 17, 2024
c2e6bee
tests wip
Apr 22, 2024
ad52ded
wip
Apr 22, 2024
7c055c8
remove xlsx format because moved to a separate module
Apr 24, 2024
74f1b0a
better readme
Apr 24, 2024
0f345db
add registerFormats method
Apr 24, 2024
7fb0a84
edit doc with the registration
Apr 24, 2024
2a0a253
verify registered formats
Apr 29, 2024
4ddd456
verify registered formats - better
Apr 29, 2024
f094c91
verify registered formats - better
Apr 29, 2024
bb4594a
verify registered formats - better
Apr 29, 2024
5dfa803
adapt tests - the end
Apr 29, 2024
bf3c24c
clean logs and todos
Apr 29, 2024
a9de590
changelog
Apr 29, 2024
c6de7ed
clean data after each test
Apr 29, 2024
29bf7f6
Merge remote-tracking branch 'origin/main' into pro-5809-add-csv-format
Apr 29, 2024
7d791cd
changelog major version warning
Apr 29, 2024
e6007cc
remove try/catch when cleaning data in tests
Apr 29, 2024
08f0481
remove stuff after each test to fix ci, maybe??
Apr 29, 2024
11 changes: 11 additions & 0 deletions CHANGELOG.md
Expand Up @@ -6,6 +6,17 @@

* Corrects documentation of required permissions.

### Adds

* Add CSV format.

### Breaking changes

**⚠️ The major version should be incremented: `2.0.0`. Please remove this line before releasing the module.**

* The signature of the `output` function from the gzip format has changed. It no longer takes the `apos` instance and now requires a `processAttachments` callback.
* `import` and `overrideDuplicates` functions now require `formatLabel` to be passed in `req`.

## 1.4.1 (2024-03-20)

### Changes
167 changes: 167 additions & 0 deletions README.md
Expand Up @@ -147,3 +147,170 @@ Exported documents maintain their locale settings. If the locale during import d
If multiple locales are set up, the user will be prompted to choose between canceling the import or proceeding with it.

![Screenshot highlighting the confirm modal letting the user choose between aborting or continuing the import when the docs locale is different from the site one.](https://static.apostrophecms.com/apostrophecms/import-export/images/different-locale-modal.png)

## How to add a new format?

### Create a file for your format:

Add your format under `lib/formats/<format_name>.js` and export it in `lib/formats/index.js`.

**Simple example** (for a single file without attachment files):

```js
// lib/formats/ods.js
module.exports = {
  label: 'ODS',
  extension: '.ods',
  allowedExtension: '.ods',
  allowedTypes: [ 'application/vnd.oasis.opendocument.spreadsheet' ],
  async input(filepath) {
    // Read `filepath` using `fs.createReadStream`
    // or any reader provided by a third-party library

    // Return parsed docs as an array
    return { docs };
  },
  async output(filepath, { docs }) {
    // Write `docs` into `filepath` using `fs.createWriteStream`
    // or any writer provided by a third-party library
  }
};
```

**Note**: The `input` and `output` functions should remain agnostic of any apostrophe logic.

```js
// lib/formats/index.js
const ods = require('./ods');

module.exports = {
  // ...
  ods
};
```
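
To make the `input`/`output` contract concrete, here is a minimal sketch of a complete single-file format. The NDJSON format itself, its file name and its MIME type are assumptions chosen for illustration and are not part of this module. It only relies on Node's built-in `fs/promises` and follows the shape shown above: `input` resolves to `{ docs }` and `output` writes `docs` to `filepath`.

```javascript
// lib/formats/ndjson.js (hypothetical format, for illustration only)
const fsp = require('node:fs/promises');

const ndjson = {
  label: 'NDJSON',
  extension: '.ndjson',
  allowedExtension: '.ndjson',
  // Assumed MIME type, adjust to what your uploads actually report
  allowedTypes: [ 'application/x-ndjson' ],
  async input(filepath) {
    // One JSON document per non-empty line
    const text = await fsp.readFile(filepath, 'utf8');
    const docs = text
      .split('\n')
      .filter(line => line.trim() !== '')
      .map(line => JSON.parse(line));

    return { docs };
  },
  async output(filepath, { docs }) {
    const lines = docs.map(doc => JSON.stringify(doc));
    await fsp.writeFile(filepath, `${lines.join('\n')}\n`);
  }
};

module.exports = ndjson;
```

Neither function touches `apos`, which keeps the format usable and testable outside of an Apostrophe project.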

### For formats with attachment files:

If you want to add a format that includes attachment files such as an archive, you can enable the `includeAttachments` option and utilize extra arguments provided in the `input` and `output` functions.

**Advanced example**:

```js
// lib/formats/zip.js
const fsp = require('node:fs/promises');
const path = require('node:path');
const { EJSON } = require('bson');
// `extract` stands for a format-specific extraction helper (not shown here)

module.exports = {
  label: 'ZIP',
  extension: '.zip',
  allowedExtension: '.zip',
  allowedTypes: [
    'application/zip',
    'application/x-zip',
    'application/x-zip-compressed'
  ],
  includeAttachments: true,
  async input(filepath) {
    let exportPath = filepath;

    // If the given path is the archive, we first need to extract it
    // and define `exportPath` to the extracted folder, not the archive
    if (filepath.endsWith(this.allowedExtension)) {
      exportPath = filepath.replace(this.allowedExtension, '');

      // Use format-specific extraction
      await extract(filepath, exportPath);
      await fsp.unlink(filepath);
    }

    // Read docs and attachments from `exportPath`
    // given that they are stored in aposDocs.json and aposAttachments.json files:
    const docs = await fsp.readFile(path.join(exportPath, 'aposDocs.json'));
    const attachments = await fsp.readFile(path.join(exportPath, 'aposAttachments.json'));
    const parsedDocs = EJSON.parse(docs);
    const parsedAttachments = EJSON.parse(attachments);

    // Add the attachment names and the paths they are going to be written to
    const attachmentsInfo = parsedAttachments.map(attachment => ({
      attachment,
      file: {
        name: `${attachment.name}.${attachment.extension}`,
        path: path.join(exportPath, 'attachments', `${attachment._id}-${attachment.name}.${attachment.extension}`)
      }
    }));

    // Return parsed docs as an array, attachments with their extra files info
    // and `exportPath` since we need to inform the caller where the extracted data is:
    return {
      docs: parsedDocs,
      attachmentsInfo,
      exportPath
    };
  },
  async output(
    filepath,
    {
      docs,
      attachments = [],
      attachmentUrls = {}
    },
    processAttachments
  ) {
    // Store the docs and attachments into `aposDocs.json` and `aposAttachments.json` files
    // and add them to the archive

    // Create an `attachments/` directory in the archive and store the attachment files inside it:
    const addAttachment = async (attachmentPath, name, size) => {
      // Read attachment from `attachmentPath`
      // and store it into `attachments/<name>` inside the archive
    };
    const { attachmentError } = await processAttachments(attachmentUrls, addAttachment);

    // Write the archive that contains `aposDocs.json`, `aposAttachments.json` and `attachments/`
    // into `filepath` using `fs.createWriteStream` or any writer provided by a third-party library

    // Return potential attachment processing error so that the caller is aware of it:
    return { attachmentError };
  }
};
```
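
When developing such a format it can help to exercise `output` without a running Apostrophe instance. The sketch below is a hypothetical stand-in for the `processAttachments` callback, written only from what the example above shows: it receives `attachmentUrls` and the `addAttachment` function, and resolves to `{ attachmentError }`. The name `fakeProcessAttachments`, the assumed shape of `attachmentUrls` (file name mapped to a location) and the dummy size are assumptions, not the module's actual implementation.

```javascript
// Hypothetical test double, NOT the callback the module passes in.
// Assumes `attachmentUrls` maps a file name to a location.
async function fakeProcessAttachments(attachmentUrls, addAttachment) {
  let attachmentError = false;

  for (const [ name, location ] of Object.entries(attachmentUrls)) {
    try {
      // The location is handed over as-is and the size is a dummy value.
      await addAttachment(location, name, 0);
    } catch (error) {
      // Remember that at least one attachment failed, but keep going
      attachmentError = true;
    }
  }

  return { attachmentError };
}

module.exports = { fakeProcessAttachments };
```

A format's `output` can then be called as `output(filepath, { docs, attachments, attachmentUrls }, fakeProcessAttachments)` in a unit test, and the returned `attachmentError` checked.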

### Add formats via a separate module

You might want to scope one or multiple formats in another module for several reasons:

- The formats rely on a dependency that is not hosted on NPM (which is the case with [@apostrophecms/import-export-xlsx](https://github.com/apostrophecms/import-export-xlsx))
- You want to fully scope the format in a separate module and repository for easier maintenance
- ...

To do so, simply create an apostrophe module that improves `@apostrophecms/import-export` and register the formats in the `init` method.


Example with an `import-export-excel` module:

```js
const formats = {
  xls: {
    label: 'XLS',
    extension: '.xls',
    allowedExtension: '.xls',
    allowedTypes: [ 'application/vnd.ms-excel' ],
    async input(filepath) {},
    async output(filepath, { docs }) {}
  },
  xlsx: {
    label: 'XLSX',
    extension: '.xlsx',
    allowedExtension: '.xlsx',
    allowedTypes: [ 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' ],
    async input(filepath) {},
    async output(filepath, { docs }) {}
  }
};

module.exports = {
  improve: '@apostrophecms/import-export',
  init(self) {
    self.registerFormats(formats);
  }
};
```

Then add the module to the project **package.json** and **app.js**.
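
Before registering formats from a separate module, it may be worth checking that each one has the properties used in the examples above. The helper below is a hypothetical sketch: `findFormatProblems` is not part of the module, and the README does not state what `registerFormats` itself verifies, so the list of required keys is an assumption taken from the example formats.

```javascript
// Hypothetical pre-registration check, based only on the keys
// that appear in the README examples.
function findFormatProblems(name, format) {
  const problems = [];

  for (const key of [ 'label', 'extension', 'allowedExtension' ]) {
    if (typeof format[key] !== 'string' || format[key] === '') {
      problems.push(`${name}.${key} must be a non-empty string`);
    }
  }

  if (!Array.isArray(format.allowedTypes) || format.allowedTypes.length === 0) {
    problems.push(`${name}.allowedTypes must be a non-empty array`);
  }

  for (const key of [ 'input', 'output' ]) {
    if (typeof format[key] !== 'function') {
      problems.push(`${name}.${key} must be a function`);
    }
  }

  return problems;
}

module.exports = { findFormatProblems };
```

Running it over `Object.entries(formats)` in a unit test catches a missing `output` or a misspelled key before the module is loaded in a project.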
6 changes: 3 additions & 3 deletions index.js
Expand Up @@ -2,7 +2,7 @@ const fs = require('fs');
 const path = require('path');
 const methods = require('./lib/methods');
 const apiRoutes = require('./lib/apiRoutes');
-const gzip = require('./lib/formats/gzip');
+const formats = require('./lib/formats');

 module.exports = {
   bundle: {
Expand All @@ -21,8 +21,8 @@ module.exports = {
   },
   init(self) {
     self.formats = {
-      gzip,
-      ...(self.options.formats || {})
+      ...formats,
+      ...self.options.formats || {}
     };

     self.enableBrowserData();
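
The merge in `init` can be read in isolation. The sketch below is a standalone illustration, assuming placeholder format objects and a hypothetical `mergeFormats` helper: later spreads win, so a format passed through the `formats` option replaces a built-in one with the same key, and `...self.options.formats || {}` parses as `...(self.options.formats || {})` because the spread operand is a full expression.

```javascript
// Placeholder objects standing in for the real built-in formats
const builtInFormats = {
  gzip: { label: 'gzip' },
  csv: { label: 'CSV' }
};

// Hypothetical helper mirroring the object literal built in `init`
function mergeFormats(builtIn, options) {
  return {
    ...builtIn,
    ...options.formats || {}
  };
}

module.exports = { builtInFormats, mergeFormats };
```

With no `formats` option the result equals the built-in set; with `{ formats: { csv: custom } }` the `csv` entry is replaced while `gzip` is kept.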
90 changes: 90 additions & 0 deletions lib/formats/csv.js
@@ -0,0 +1,90 @@
const fs = require('node:fs');
const { stringify } = require('csv-stringify');
const { parse } = require('csv-parse');

module.exports = {
  label: 'CSV',
  extension: '.csv',
  allowedExtension: '.csv',
  allowedTypes: [ 'text/csv' ],
  async input(filepath) {
    const reader = fs.createReadStream(filepath);
    const parser = reader
      .pipe(
        parse({
          columns: true,
          bom: true,
          cast(value, context) {
            if (context.header) {
              return value;
            }

            try {
              return JSON.parse(value);
            } catch {
              return value;
            }
          }
        })
      );

    const docs = [];

    parser.on('readable', function() {
      let doc;
      while ((doc = parser.read()) !== null) {
        docs.push(doc);
      }
    });

    return new Promise((resolve, reject) => {
      reader.on('error', reject);
      parser.on('error', reject);
      parser.on('end', () => {
        console.info(`[csv] docs read from ${filepath}`);
        resolve({ docs });
      });
    });
  },
  async output(filepath, { docs }) {
    const writer = fs.createWriteStream(filepath);
    const stringifier = stringify({
      header: true,
      columns: getColumnsNames(docs),
      cast: {
        date(value) {
          return value.toISOString();
        },
        boolean(value) {
          return value ? 'true' : 'false';
        }
      }
    });

    stringifier.pipe(writer);

    // plunge each doc into the stream
    docs.forEach(record => {
      stringifier.write(record);
    });

    stringifier.end();

    return new Promise((resolve, reject) => {
      stringifier.on('error', reject);
      writer.on('error', reject);
      writer.on('finish', () => {
        console.info(`[csv] export file written to ${filepath}`);
        resolve();
      });
    });
  }
};

function getColumnsNames(docs) {
  const columns = new Set();
  docs.forEach(doc => {
    Object.keys(doc).forEach(key => columns.add(key));
  });
  return Array.from(columns);
}
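
The two pieces of logic in this file that do not depend on the CSV libraries can be tried on their own. The sketch below copies the body of the `cast` callback into a hypothetical `castCell` function (the header check is left out) and reuses `getColumnsNames` as written above, so it runs with plain Node and no `csv-parse` or `csv-stringify` install.

```javascript
// Same try/catch as the parser's `cast` option, minus the header check
function castCell(value) {
  try {
    return JSON.parse(value);
  } catch {
    return value;
  }
}

// Same as in lib/formats/csv.js: union of keys, in first-seen order
function getColumnsNames(docs) {
  const columns = new Set();
  docs.forEach(doc => {
    Object.keys(doc).forEach(key => columns.add(key));
  });
  return Array.from(columns);
}

module.exports = { castCell, getColumnsNames };
```

One consequence of casting through `JSON.parse`: a cell holding the text `123`, `true` or `null` comes back as a number, a boolean or `null` rather than a string, while anything that is not valid JSON (including an empty cell) stays a string.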