Let’s test and define one of the best ways to upload and download documents using the SAP Document Management Service.
Uploading documents to a server usually consumes memory and disk space, so let’s see whether we can optimize some of the resources involved.
We will build a Node.js server and compare the different approaches to uploading and downloading files.
Node.js is built on an event-driven, non-blocking I/O model, making it particularly suitable for handling concurrent I/O operations. This asynchronous nature allows Node.js to handle a large number of concurrent connections efficiently without the overhead of threads, making it well suited for applications that require high scalability.
We will be using the SAP Document Management Service client described in this blog: Node Js Client.
// express and cors are assumed to be installed; CmisSessionManager and
// sdmCredentials come from the SDM client described in the blog linked above
const express = require('express');
const cors = require('cors');

const REPOSITORY_ID = "com.demo.test.sdm";
let sm = new CmisSessionManager(sdmCredentials);
// create the repository if it does not exist yet
// await sm.createRepositoryIfNotExists(REPOSITORY_ID, "provider", {});
const app = express();
const port = 3000;
app.use(cors());
app.get('/', (req, res) => {
res.send('This is a test Server!');
});
app.listen(port, () => {
console.log(`Server is running on port ${port}`);
});
This is a simple Node.js server that accepts incoming requests.
Let’s create a download API that receives an object path from the client and returns the document, with the catch that the process should be very memory efficient.
app.get('/download', async (req, res) => {
// create or reuse cmis session
let session = await sm.getOrCreateConnection(REPOSITORY_ID, "provider");
// get the object by path, e.g. req.query.objectPath = "/temp/doc.pdf"
let obj = await session.getObjectByPath(req.query.objectPath);
// get the content Stream
let result = await session.getContentStream(obj.succinctProperties["cmis:objectId"]);
// update the needed Headers
res.header('Content-Type', obj.succinctProperties["cmis:contentStreamMimeType"]);
res.header('Content-Length', obj.succinctProperties["cmis:contentStreamLength"]);
res.header('Content-Disposition', `attachment; filename="${obj.succinctProperties["cmis:name"]}"`);
// pipe the doc store response to the client
result.body.pipe(res);
});
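For reference, a client could call this endpoint like so (a minimal browser-side sketch; the objectPath value is the example path from the comment above):
// fetch the document and trigger a download in the browser
const res = await fetch('http://localhost:3000/download?objectPath=' + encodeURIComponent('/temp/doc.pdf'));
const blob = await res.blob();
const a = document.createElement('a');
a.href = URL.createObjectURL(blob);
a.download = 'doc.pdf';
a.click();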
The above approach has the following advantages:
1. Memory Efficiency: Streams process data in smaller chunks, allowing you to handle files larger than the available memory.
2. Scalability: When dealing with multiple streams of data or concurrent I/O operations, streams enable efficient handling of concurrent requests without blocking other operations.
3. Reduced Latency: Streams can start processing data as soon as they receive the initial chunks, reducing overall latency.
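One caveat with the pipe call above: pipe() does not forward errors from the source stream to the response. A minimal sketch using Node’s stream.pipeline (which does propagate errors), assuming the same result.body and res as in the download handler:
const { pipeline } = require('stream');
// pipe the DMS content stream to the client and handle stream errors
pipeline(result.body, res, (err) => {
  if (err) {
    console.error('Download stream failed:', err);
    // tear down the response so the client does not hang
    res.destroy(err);
  }
});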
Let’s create an upload API using Multer, where the client uploads a single file and the server forwards it to DMS, but we want to do this in as memory-efficient a way as possible.
// Multer is assumed to be installed; fs is a Node.js built-in
const multer = require('multer');
const fs = require('fs');
// Set up storage for uploaded files
const storage = multer.diskStorage({
destination: function (req, file, cb) {
cb(null, 'uploads/');
}
});
// Create multer instance with the storage configuration
const upload = multer({storage: storage})
app.post('/upload', upload.single('file'), async (req, res) => {
if (req.file) {
// create or reuse DMS session
let session = await sm.getOrCreateConnection(REPOSITORY_ID, "provider");
// Multer buffers the file parts on disk and returns a temp file path;
// let's create a read stream from that path
let readStream = fs.createReadStream(req.file.path);
// upload the file to DMS
let response = await session.createDocumentFromStream("/temp", readStream, req.file.originalname)
res.status(200).end(response.data);
} else {
res.status(400).end('Uploaded File not found.');
}
});
In this approach we use Multer to parse the multipart form data through which the file is uploaded. Because we use the disk storage option, the uploaded file is first cached on the server’s disk, and Multer passes a file reference along with the request object for the controller to use. Next, we stream the file to DMS and store it there.
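For reference, a minimal client call for this endpoint might look like this (a sketch assuming the server runs on localhost:3000 and fileInput is a hypothetical file input element; the field name must match upload.single('file')):
// build a multipart form with the selected file and post it to /upload
const formData = new FormData();
formData.append('file', fileInput.files[0]);
const response = await fetch('http://localhost:3000/upload', {
  method: 'POST',
  body: formData
});
console.log(await response.text());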
Results:
Total Data Uploaded: 2 GB
No. of files: 10 txt files
Time Taken: 12 mins
Total Memory Used: 3.8 GB
Server Disk Used: 2.1 GB
Some advantages and disadvantages of this approach are:
Advantages:
1. Multer handles the multipart parsing out of the box, so the implementation is simple.
2. The complete file is available on disk before the DMS upload starts, which makes validation and retries straightforward.
Disadvantages:
1. Every upload is first written to the server’s disk, roughly doubling the I/O and consuming disk space (2.1 GB for 2 GB of uploads above).
2. Memory usage grows quickly under load (3.8 GB for the 10-file test above).
3. The temporary files have to be cleaned up after the upload completes.
This approach is only good when you are prototyping an MVP or handling small files.
Let’s create another upload API where we try to remove Multer’s disk and memory limitations.
app.post('/upload-optimised', async (req, res) => {
// get the file metadata from custom headers.
const fileName = req.headers["cs-filename"];
const opType = req.headers["cs-operation"];
const mimeType = req.headers["content-type"];
// create or reuse the DMS session
let session = await sm.getOrCreateConnection(REPOSITORY_ID, "provider");
let response = {success: "false"};
// if operation is "create" then create the document in DMS with initial chuck
if (opType === "create") {
// create a document from the response stream
response = await session.createDocumentFromStream("/temp", req, fileName);
}
// if operation is "append" then append the content an existing file
if (opType === "append") {
const obj = await session.getObjectByPath("/temp/" + fileName);
// get the object id from the object path.
const objId = obj.succinctProperties["cmis:objectId"];
// append the content to the previously created file.
response = await session.appendContentFromStream(objId, req);
}
res.json(response);
});
In this approach, we append the file content over multiple HTTP requests, which requires custom handling on the client.
Below is an HTML page that uploads a file to our server; it manually breaks the file into chunks and uses the append functionality.
<html>
<head>
<title>TEST</title>
</head>
<body>
<h1>Upload a File</h1>
<div>
<input id="uploadFile" type="file" name="fileInput">
<input value="Upload" onclick="uploadFile(this)">
</div>
<script>
// trigger when upload button is clicked
function uploadFile() {
let elementById = document.getElementById("uploadFile");
// get the selected file
const file = elementById.files[0];
if (file) {
// read the file content as text and upload it in chunks (fine for the txt files tested here)
const reader = new FileReader();
reader.onload = function (event) {
const contents = event.target.result;
console.log('File contents:', contents.length);
uploadFileInChunks(file, contents);
};
reader.readAsText(file);
}
}
async function uploadFileInChunks(file, content) {
// specify your desired chunk size (1 KB here, for demonstration)
const chunkSize = 1024;
// console.log(content);
// total number of chunks to be uploaded; could be used to drive a progress bar
const totalChunks = Math.ceil(content.length / chunkSize);
for (let i = 0; i < totalChunks; i++) {
// calculate start of the chunk
const start = i * chunkSize;
// calculate the end of the chunk
const end = Math.min(start + chunkSize, content.length);
// get the chunk from the entire content
const chunk = content.slice(start, end);
// Process the chunk
console.log('Chunk', i + 1, 'of', totalChunks, ':', chunk);
// create if first chunk or else append
const operation = i === 0 ? "create" : "append";
const myHeaders = new Headers();
myHeaders.append("cs-filename", file.name);
myHeaders.append("cs-operation", operation);
myHeaders.append("Content-Type", file.type);
const requestOptions = {
method: 'POST',
headers: myHeaders,
body: chunk,
redirect: 'follow'
};
// upload to the server
const response = await fetch("http://localhost:3000/upload-optimised/", requestOptions);
console.log(await response.json());
}
}
</script>
</body>
</html>
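Note that readAsText and string slicing are only reliable for text files. For binary files, a sketch of the same chunking loop using File.slice (so each chunk stays a binary Blob) could look like this; uploadBinaryInChunks is a hypothetical helper that targets the same /upload-optimised endpoint:
async function uploadBinaryInChunks(file) {
  const chunkSize = 1024 * 1024; // 1 MB binary chunks
  const totalChunks = Math.ceil(file.size / chunkSize);
  for (let i = 0; i < totalChunks; i++) {
    // slice the File object itself instead of a decoded string
    const chunk = file.slice(i * chunkSize, Math.min((i + 1) * chunkSize, file.size));
    const headers = new Headers();
    headers.append("cs-filename", file.name);
    headers.append("cs-operation", i === 0 ? "create" : "append");
    headers.append("Content-Type", file.type || "application/octet-stream");
    const response = await fetch("http://localhost:3000/upload-optimised/", {
      method: 'POST',
      headers: headers,
      body: chunk
    });
    console.log(await response.json());
  }
}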
Results:
Total Data Uploaded: 2 GB
No. of files: 10 txt files
Time Taken: 9 mins
Total Memory Used: 200 MB
Server Disk Used: 0 bytes
The results make the difference in server resource usage very clear.
This method has its own set of advantages and disadvantages:
Advantages of using the Append Approach:
1. Almost no server disk usage, since the chunks are streamed straight to DMS (0 bytes above).
2. A low, stable memory footprint (about 200 MB in the test above), regardless of file size.
3. Faster overall, because there is no intermediate write to the server’s disk.
Disadvantages of using the Append Approach:
1. Custom chunking logic is required on the client, as shown in the HTML page above.
2. Many small HTTP requests add overhead, so the chunk size has to be tuned.
3. If the client aborts midway, a partially written document is left behind in DMS and has to be cleaned up.
The choice of approach depends on specific project requirements, including file sizes, network conditions, server capabilities, and desired functionalities. Each approach has its strengths and trade-offs, and selecting the most suitable approach should be based on a careful consideration of these factors.