K-Box and K-Link Deep Dive

Alessio Vertemati

A dive into some K-Box and K-Link internals

  • Inside the K-Box application
  • How we handle documents and videos
  • Inside the K-Link
  • Deploy strategies

K-Box

Outside of the K-Box

  • the K-Search
  • the K-Search Engine
  • Database

Inside the K-Box

  • Video Processing
  • Language Guessing
  • Resumable upload
  • Queue System
  • Supervisor
  • Web Server

Document upload and processing

  1. File is uploaded
  2. UploadCompleteEvent is raised
  3. A pipeline of actions is performed

File Upload

Drag and drop via Form upload or Resumable Upload Protocol

Upload of large files

tus.io
https://github.com/OneOffTech/laravel-tus-upload
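
To make the resumable part concrete, here is a minimal sketch of how a client could plan the next request of a tus 1.0.0 upload. The header names come from the tus core protocol. The function name `nextTusChunk` and the returned array layout are assumptions made for this illustration and are not part of laravel-tus-upload.

```php
<?php

/**
 * Describe the next PATCH request of a resumable upload.
 *
 * $serverOffset is the value of the Upload-Offset header that the
 * server returned to a HEAD request on the upload URL.
 * Returns null when the server already holds the whole file.
 */
function nextTusChunk(int $fileSize, int $serverOffset, int $chunkSize): ?array
{
    if ($serverOffset >= $fileSize) {
        return null; // nothing left to send
    }

    // Never send more bytes than what is still missing on the server.
    $length = min($chunkSize, $fileSize - $serverOffset);

    return [
        'headers' => [
            'Tus-Resumable'  => '1.0.0',
            'Upload-Offset'  => (string) $serverOffset,
            'Content-Type'   => 'application/offset+octet-stream',
            'Content-Length' => (string) $length,
        ],
        'start'  => $serverOffset, // first byte of the file to read
        'length' => $length,       // number of bytes to read and send
    ];
}

// A 10 MB file interrupted after 6 MB, resumed with 5 MB chunks.
$chunk = nextTusChunk(10000000, 6000000, 5000000);
$done  = nextTusChunk(10000000, 10000000, 5000000);
```

After an interruption the client asks the server for the current offset and continues from there, so bytes already received are not sent again.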

Document elaboration pipeline


protected $actions = [
	Actions\ExtractFileProperties::class,
	Actions\GuessLanguage::class,
	Actions\AddToSearch::class,
	Actions\GenerateThumbnail::class,
	Actions\ElaborateVideo::class,
];
				
github.com/k-box/k-box/blob/master/app/DocumentsElaboration/Kernel.php
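
How such a list of actions can be executed, as a simplified sketch and not the actual K-Box Kernel. The `Action` interface, the `Pipeline` class, the array-based document and the bodies of the two actions are assumptions made for this illustration. Only the idea of running the listed classes in order comes from the slide above.

```php
<?php

// Hypothetical contract for one step of the pipeline.
interface Action
{
    // Receives the document data and returns it, possibly enriched.
    public function run(array $document): array;
}

final class ExtractFileProperties implements Action
{
    public function run(array $document): array
    {
        $document['size'] = strlen($document['content']);
        return $document;
    }
}

final class GuessLanguage implements Action
{
    public function run(array $document): array
    {
        // Placeholder value: a real step would call a language detector.
        $document['language'] = 'en';
        return $document;
    }
}

final class Pipeline
{
    /** @var string[] action class names, executed in the given order */
    private array $actions;

    public function __construct(array $actions)
    {
        $this->actions = $actions;
    }

    public function process(array $document): array
    {
        foreach ($this->actions as $actionClass) {
            // Each action receives the output of the previous one.
            $document = (new $actionClass())->run($document);
        }
        return $document;
    }
}

$pipeline = new Pipeline([
    ExtractFileProperties::class,
    GuessLanguage::class,
]);

$result = $pipeline->process(['content' => 'Hello K-Box']);
```

Keeping the steps as a plain list of class names means a new step can be added, or an existing one reordered, without touching the code that runs them.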

Language Detection

Identify in which language a document is written

186 languages recognized

https://github.com/wooorm/franc

Video processing

Delivering video to the browser

We open-sourced a command line tool

It uses FFmpeg and Shaka Packager

It is written in JavaScript and packaged as a cross-platform binary

github.com/OneOffTech/video-processing-cli/

K-Box deployment

  • Memory: 2 GB RAM minimum; 4 GB suggested.
  • Storage: minimum of 5 GB of free space for installation.
  • CPU: x86-64 processor. H.264 acceleration is optional, but recommended if the optional video streaming service is enabled.

K-Link

K-Search

Defines the data that can be searched

Mediates between the full text search and the clients/components

Connects the metadata of the API with the content of the files

Configuration

"repositories": [
	{
		"type": "vcs",
		"url": "https://git.klink.asia/main/k-search-client-php.git"
	}
],
									
"k-box/k-search-client-php": "3.0.*",
"php-http/guzzle6-adapter": "^1.1",
"guzzlehttp/guzzle": "~6.2.0",
"ramsey/uuid": "^3.7",
						

// Hash of a file stored on disk
return hash_file( 'sha512', $filePath );
// Hash of content already in memory
return hash( 'sha512', $content );
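
A quick check that the two calls above give the same SHA-512 digest for the same bytes. The helper names `hashFromFile` and `hashFromContent` and the temporary file are assumptions made for this example.

```php
<?php

// Hypothetical helpers wrapping the two calls shown above.
function hashFromFile(string $filePath): string
{
    return hash_file('sha512', $filePath);
}

function hashFromContent(string $content): string
{
    return hash('sha512', $content);
}

// Write some bytes to a temporary file so both variants can be compared.
$path = tempnam(sys_get_temp_dir(), 'kbox');
file_put_contents($path, 'example content');

$fromFile    = hashFromFile($path);
$fromContent = hashFromContent('example content');

unlink($path); // clean up the temporary file

// Both digests are 128 hexadecimal characters and identical.
$same = ($fromFile === $fromContent);
```

`hash_file` reads from disk, so it suits uploaded files. `hash` works on a string that is already in memory.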
				

use Carbon\Carbon;
use Ramsey\Uuid\Uuid;
use KSearchClient\Client;
use KSearchClient\Model\Data\Data;
use KSearchClient\Model\Data\Author;
use KSearchClient\Model\Data\Uploader;
use KSearchClient\Http\Authentication;
use KSearchClient\Model\Data\Copyright;
use KSearchClient\Model\Data\Properties;
use KSearchClient\Model\Data\CopyrightOwner;
use KSearchClient\Model\Data\CopyrightUsage;

const KSEARCH_URL = 'http://localhost:8181/';
const APP_URL = 'http://localhost:9000/data/';
const THUMBNAILS_URL = 'http://localhost:9000/thumbnails/';
const APP_TOKEN = 'A_MORE_SECURE_TOKEN';

require_once __DIR__ . '/to_data.php';

$client = Client::build(KSEARCH_URL, new Authentication(APP_TOKEN, APP_URL));

$result = $client->add($data, 'Water improved knowledge management');
				

$hash = KlinkDocumentUtils::generateDocumentHash($file);
$uuid = Uuid::uuid4()->toString();
						
$data = new Data();
$data->hash = $hash;
$data->type = 'document';
$data->url = APP_URL . $title;
$data->uuid = $uuid;
							
$author = new Author();
$author->name = 'Alessio';
$author->email = 'alessio@oneofftech.xyz';
					
$data->authors = [$author];
					
$uploader = new Uploader();
$uploader->name = 'Alessio';
$uploader->url = APP_URL;
					
$data->uploader = $uploader;
					
$data->copyright = new Copyright();
$data->copyright->owner = new CopyrightOwner();
$data->copyright->owner->name = 'Alessio';
$data->copyright->owner->website = APP_URL;
// $data->copyright->owner->email = $copyright_owner->get('email', '');
// $data->copyright->owner->address = $copyright_owner->get('address', '');

$data->copyright->usage = new CopyrightUsage();
$data->copyright->usage->short = 'C';
$data->copyright->usage->name = 'All rights reserved';
// $data->copyright->usage->reference = $usage_license->license;

$data->properties = new Properties();
$data->properties->title = $title;
$data->properties->filename = $title;
$data->properties->mime_type = $mime_type;
$data->properties->language = 'en';
$data->properties->collections = [];
$data->properties->tags = [];

$created_at = Carbon::now();
$updated_at = Carbon::now();

$data->properties->created_at = new DateTime($created_at->format('Y-m-d H:i:s.u'), $created_at->getTimezone());
$data->properties->updated_at = new DateTime($updated_at->format('Y-m-d H:i:s.u'), $updated_at->getTimezone());
$data->properties->size = filesize($file);
$data->properties->abstract = '';
$data->properties->thumbnail = THUMBNAILS_URL . md5($title) . '.png';
					
				

K-Link Registry

Sneak peek at what we are working on

What we saw

  • We covered some K-Box and K-Link internals
  • We saw some deployment strategies