JavaScript Websockets implementation and future outlook

1.0 Introduction

 Many web developers have been using JSON since 2000 or even earlier, when the term JSON (JavaScript Object Notation) was yet to be coined. The reason why data could be structured in a JSON-conforming way before JSON's formal definition as a web technology standard and its RFC is straightforward: objects in JavaScript are most concise when written in JavaScript's own object-literal notation, which does not allow for much leeway. Nonetheless it was a powerful, if rather obscure, way of passing JavaScript objects from a server and using the "old" JavaScript function eval to instantly turn them into live objects.
AJAX stands for Asynchronous JavaScript and XML, a term coined back when it seemed likely that XML would become the standard technology for interoperability and universal data exchange. Asynchronous JavaScript applications were possible even before AJAX, whose rise was owed to the browser-native implementation of XMLHttpRequest. This was, and still is, possible by using the write method of the DOM document object to append another script tag whose src attribute points to a dynamic server-side script. The browser then loads and parses the script that was just attached to the document tree and that contains dynamically generated JavaScript. This server-side generation of JavaScript-like code (e.g. JSON) is not unlike AJAX, depending on the GET or POST parameters passed either with the URL or via a cookie / session id.
Since web pages are built up progressively in the browser - one of the core principles of the DOM and browser rendering - such document.write calls have been possible for well over a decade now. A possible AJAX implementation before its time would have looked like this:

//CLIENT:
....
function scriptLoad(){
    //serverContent was declared by the dynamically loaded script
    var scriptvariable = serverContent;
    //delete the variable from the global namespace again
    delete window.serverContent;
    //do all that needs to be done
    ....
    alert("new script arrived!");
}

document.write('<script type="text\/javascript" src="scriptAPI.php?op=retrieve"' +
    ' name="scriptDyn" onload="scriptLoad()"><\/script>');

//SERVER <scriptAPI.php?op=retrieve>, emitting JavaScript:
switch ($op) {
....
    case 'retrieve':
        echo 'serverContent = { var1: "", var2: "", ... };';
        break;
....
}

//Problem set: direct access to the loaded script's source is not possible!
var scripts = document.getElementsByName("scriptDyn");
//scripts[scripts.length - 1].innerHTML / .value / .script ..... all stay empty for an external script

An asynchronous JavaScript chat application, part of a web framework, implemented by me in 2001 and extended in 2003.
XMLHttpRequest initiates the request through its open method, which takes up to five parameters (method, URL, async flag, user name, password) defining the resource to be requested, but only the first two are mandatory.
Examples follow:

//a simple log function can look like this
function log(logMsg) {
    var xhr = new XMLHttpRequest();
    xhr.open("POST", "log.php?op=wb&id=sadfs75fsd8f65s78afi");
    xhr.setRequestHeader("Content-Type", "text/plain;charset=UTF-8");
    xhr.send(logMsg);
}



/*
XMLHttpRequest object:

Allows data to be sent via POST or GET and the response to be processed, even as DOM, thanks to its methods and attributes.
Attributes:
readyState: successively changes its value from 0 to 4, where 4 means "ready".
status: 200 is OK, 404 if the page is not found.
responseText: holds the loaded data as a string of characters.
responseXML: holds a loaded XML document; DOM methods allow data to be extracted.
onreadystatechange: property that takes a function as value, invoked whenever the readystatechange event is dispatched.

Methods:
open(method, url, async, uname, pswd)
method: type of request, GET or POST.
url: the location of the file, with a path.
async: true (asynchronous) / false (synchronous).
uname, pswd: a login and a password may be added as arguments.
send("string"): null for a GET command.

Synchronous example:

var request = new XMLHttpRequest();
request.open('GET', 'http://www.url.org/', false);
request.send(null);

//checks the status code after the transaction is completed; 200 is HTTP's "OK" result
if (request.status == 200)
    console.log(request.responseText);

*/
function xhrLoad(){
    var formInput = document.getElementById("loadurl");
    var xhr = new XMLHttpRequest();
    xhr.open("GET", formInput.value, true); //true...asynchronous mode, i.e. AJAX

    xhr.onreadystatechange = function() { // callback for the response handling
        if (xhr.readyState == 4) {
            // Received
            if (xhr.status == 200) {
                //console.log(xhr.responseText)
                //set text:
                document.getElementById("words").innerText = xhr.responseText;
            } else {
                // Error
                console.log('Error', xhr.statusText);
            }
        } else {
            // not finished yet (readyState 0-3)
        }
    };
    xhr.send(null); //send headers only, to actually initiate the request
}
Prior to XMLHttpRequest, direct access to the contents of a script that had been loaded asynchronously via document.write was not possible. Workarounds did exist, however, using hidden elements with a src attribute, for instance images, which allowed access to content by rather convoluted means.
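One such workaround, sketched below under the assumption of a hypothetical logBeacon.php endpoint, was the hidden image "beacon": data travels only from client to server, encoded in the image URL.

//sketch of such a workaround: an image used as a one-way "beacon"
var beacon = new Image(); //never appended to the DOM, so it stays invisible
beacon.src = "logBeacon.php?msg=" + encodeURIComponent("page loaded");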
Document.write is now considered obsolete in most cases: firstly because of its security concerns (there are few restrictions), secondly because the strict enforcement of DOM methods like appendChild and removeChild by many successful web toolkits and frameworks has drastically reduced its popularity, and thirdly because document.write may append its content to the document at an unforeseeable time, or may even replace the document entirely.
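The DOM-method equivalent of the document.write script injection shown earlier is a small sketch along these lines, reusing the scriptLoad callback and the scriptAPI.php endpoint from that example:

//the appendChild replacement for the document.write approach above
var s = document.createElement("script");
s.src = "scriptAPI.php?op=retrieve";
s.onload = scriptLoad; //reuse the callback defined earlier
document.getElementsByTagName("head")[0].appendChild(s);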
The most obvious difference is that with document.write a variable or function declaration has to occur within the script that is passed, which raises security concerns; not so with XMLHttpRequest. XMLHttpRequest allows direct access to the payload, which is usually JSON-conforming text, and can even parse the payload in the case of an XML text and return it as a DOM object.
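As a minimal sketch of the XML case, assuming a hypothetical data.xml resource containing <item> elements, the already parsed document is available through responseXML:

//reading an XML response as a ready-parsed DOM tree
var xhr = new XMLHttpRequest();
xhr.open("GET", "data.xml", true);
xhr.onreadystatechange = function() {
    if (xhr.readyState == 4 && xhr.status == 200) {
        var doc = xhr.responseXML; //a Document object, not a string
        var items = doc.getElementsByTagName("item"); //normal DOM methods apply
        console.log(items.length + " items received");
    }
};
xhr.send(null);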
The fastest way to get the serialized content back into a JavaScript object was the method eval(). Using eval("javascript") for parsing JSON content is a security risk, since it gives full access to the global JavaScript scope and makes code injection easy. This concept was therefore quickly replaced by parsers, built on top of JavaScript methods, that evaluated the content while checking its validity against the JSON standard. Browsers soon introduced another native object, aptly called JSON, with two standardized methods: stringify and parse. Interestingly, as the name suggests, XMLHttpRequest had been intended as a convenient method for transferring XML-based data, XHTML included, which is accessible as a DOM tree exposed through the responseXML attribute. It was not foreseen then that its primary use would become the transfer of text containing JSON-structured data, which is accessible through the aptly named attribute responseText.
The asynchronous part stems from the fact that a callback function is bound to the event 'onreadystatechange' and executed once the server has responded. The native JSON object is used as follows:

var obj = {
    var1: 1001,
    var2: "message",
    varO: {1: "test", 2: "me", "foo": "bar"}
};
var jsonTxt = JSON.stringify(obj);
//shows serialized JSON
console.log(jsonTxt + " ; typeof = " + typeof jsonTxt);
//unserialized
var jsonObj = JSON.parse(jsonTxt);
console.log(jsonObj + " ; typeof = " + typeof jsonObj);
To conclude the chapter on JSON: be aware that JSON as a serializer interface is quite limited, in that neither functions nor complex objects (e.g. cyclic references) can be serialized, unlike more powerful serializers such as Pickle in Python.
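A minimal sketch of these limits: functions are silently dropped, and a cyclic structure makes JSON.stringify throw.

//functions are dropped from the serialization
console.log(JSON.stringify({ fn: function(){} })); //prints "{}"

//cyclic references cannot be serialized at all
var cyclic = { name: "node" };
cyclic.self = cyclic;
try {
    JSON.stringify(cyclic);
} catch (e) {
    console.log("stringify failed: " + e.message); //"Converting circular structure..." or similar
}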

  2. Websockets

 In terms of speed not much changed with the use of XMLHttpRequest, a method which Internet Explorer was among the first browsers to implement, over ten years ago. In those ten years XML has become less important, or at least now competes against JSON. A fundamental problem with XML is its extreme overhead. A 200 MB mzXML file (an XML-based format for mass spectrometry data) can easily be compressed to 6 MB (Arc, maximum compression). This overhead not only burdens the data pipelines of the PC or server (storage, memory, processor caches and registers) unnecessarily, it also consumes quite a bit of processing power due to parsing. Owing to the complex nature of the format, it is also far less accessible in many situations; take streaming a 1 GB mzXML file into a simple spectra viewer, for instance, although implementing such a concept is anything but rocket science. The situation is even worse when it comes to the XML-based RPC (Remote Procedure Call) stack of SOAP and WSDL (Web Services Description Language). Indeed I find many developers, myself included, using it less and less, because more services are becoming accessible by alternative means such as JSON.

When it comes to XML, a lot of overhead is unavoidable. That cost is acceptable in its original context, the hypertext markup language HTML. The markup makes direct editing much easier, as the code can be quickly navigated, and the browser parses it into a Document Object Model with each element instantiating an HTMLElement. This makes browser user interfaces easily the most powerful, but also the slowest, of all options. Each element comes with dozens, up to a hundred, different methods and attributes, not counting those defined by user JavaScript libraries. All user interface elements are under the command of a powerful DOM hierarchy and embedded in a JIT JavaScript interpreter / compiler. This makes RIA (Rich Internet Application) development unparalleled. The drawback is extreme overhead: web applications typically have a tenfold memory footprint compared to classical UIs. However, average processing speed and memory size, along with increasing browser speeds, finally seem to be a match for Internet applications.

AJAX too is patchwork, utilizing XMLHttpRequest as a convenient workaround. The traffic is sent over the HTTP protocol, most commonly as either GET or POST requests. The problem lies in the unnecessary header overhead, with cookies etc. Here is an example of an already cleaned-up HTTP header generated on an Apache / PHP server, with the PHP 5 function header_remove('X-Powered-By:'),....

Request URL: http://localhost:8888/rpc_handler.php
Request Method: POST
Status Code: 200 OK

Request Headers:
POST /rpc_handler.php HTTP/1.1
Host: localhost:8888
Connection: keep-alive
Content-Length: 45
Origin: http://localhost:8888
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.35 Safari/535.1
Content-Type: application/json
Accept: */*
Referer: http://localhost:8888/dojo-release-1.6.1/mytest.html
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: navState=10
In addition to the header overhead, HTTP was devised as a discrete request protocol with hypertext browsing in mind, over two decades ago, and has remained astonishingly rigid, which in light of its widespread distribution is understandable. As such, support for requesting only partial content (like FTP resume) was implemented at a very late stage, in the form of the so-called HTTP Range request; a table of all HTTP header fields is listed on Wikipedia. A request header such as Range: bytes=500-999 asks for only that slice of the resource. Serving such a range request would look like this in PHP:

$filesize = filesize($file);

// Send standard headers
header("Content-Type: $contenttype");
// note: for a 206 response the Content-Length should be the size of the requested range
header("Content-Length: $filesize");
header('Content-Disposition: attachment; filename="'.basename($file).'"');
header('Accept-Ranges: bytes');
//Answer the request with partial content
header('HTTP/1.1 206 Partial Content');

//allowed range formats: "bytes=x-y", "bytes=-x", "bytes=x-", "bytes=x-y,a-b"
header("Content-Range: bytes $start-$end/$filesize");
In JavaScript, setting Range headers would allow a very large remote file to be discretely "sampled", gradually combining the data to obtain the complete file, for instance to get a coarse spectrum over the entire range first.

//the total filesize can be obtained via an HTTP HEAD request,
//returning only the headers, without the content
var xhr = new XMLHttpRequest();
var filesize = -1;
xhr.open('HEAD', 'http://massspec.com/myMZXML.xml', true);
xhr.onreadystatechange = function(){
    if (xhr.readyState == 4) {
        if (xhr.status == 200) {
            filesize = parseInt(xhr.getResponseHeader('Content-Length'));
            console.log('Size [bytes]: ' + filesize);
        } else {
            alert('ERROR');
        }
    }
};
//perform request
xhr.send();

//a simple sample function
function sampleData(startBy, endBy) {
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "http://massspec.com/myMZXML.xml");
    xhr.setRequestHeader("Content-Type", "text/xml;charset=UTF-8");
    ....//same as in the php example
    //the Range request header asks the server for the given byte slice
    xhr.setRequestHeader("Range", "bytes=" + parseInt(startBy) + "-" + parseInt(endBy));
    ....
    xhr.send(null);
}
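A possible way to use this, once the HEAD request above has completed and filesize is known, is to request every n-th slice of the file to get a coarse overview first (the slice size and stride below are purely illustrative):

//coarse sampling: request every 10th 64 KiB slice of the remote file
var sliceSize = 65536;
for (var offset = 0; offset < filesize; offset += sliceSize * 10) {
    sampleData(offset, offset + sliceSize - 1);
}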
Increasingly demanding applications push the web as a new platform. In such cases it is often necessary to have a dedicated connection to a server which can process requests as close to realtime as possible. This means avoiding unnecessary overhead in terms of processing, number of requests (e.g. through buffering) and protocol overhead (e.g. handshake simplicity). In most cases an AJAX request invokes a server-side script, involving hundreds of function calls, over a protocol layered on TCP that is tailored towards discrete resource requests.

  2.1 Limiting overhead
A straightforward way to limit unnecessary HTTP requests, and thus unburden the server, is to merge all JavaScript resources into one file. This can be done dynamically on the server side and then cached. All images the site requires that are abstract design graphics, but not pictures (photos), can be merged into one image sprite. The individual images are then displayed discretely by defining the image proportions in, for instance, a div element and setting the background-image property as well as its position:

<div id="mySprite">
</div>
<style>
#mySprite {
width:32px;
height:32px;
background: url(../images/sprite.png) no-repeat top left;
background-position: 0 -52px; //x y; in this case -52px down from the top
}
</style>

<!--Binary data can be written directly as a base64 encoded text-->
<img src="data:image/gif;base64,##############...">

Base64-encoded data can also be written directly into the HTML file, but this may unnecessarily burden the DOM and memory resources (memory consumption is likely to be at least twofold for such a file resource).
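A quick way to see the size penalty, here with a short stand-in string rather than real image bytes: base64 inflates the payload by roughly a third even before the browser keeps additional decoded copies in memory.

//base64 output is about 4/3 the size of the input
var raw = "stand-in for binary image data";
var b64 = btoa(raw);
console.log(raw.length + " input bytes -> " + b64.length + " base64 characters");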

  2.2 Implementing websockets
 It is necessary to allow developers using internet browsers to communicate directly with a server on top of the TCP stack. This has been requested for a long time and is part of the HTML5 specification. Many browsers have implemented it partially, but deactivated the feature by default. This implementation is known as WebSockets. Mozilla Firefox was one of the first to implement it, in the object WebSocket; recently this was renamed to MozWebSocket.
Here is an example of how to use it:



// let Firefox share the fun
if (window.MozWebSocket) {
    window.WebSocket = window.MozWebSocket;
}

if ("WebSocket" in window) {
    var ws = new WebSocket("ws://localhost:12345/websocket/server.php");
    ws.onopen = function() {
        // Web Socket is connected. You can send data by send() method.
        ws.send("message to send"); ....
    };
    ws.onmessage = function (evt) { var received_msg = evt.data; alert(received_msg); };
    ws.onclose = function() { };
} else {
    // the browser doesn't support WebSocket.
}
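Since the payload of a WebSocket message is delivered as plain text, the JSON serialization from the introduction combines naturally with it; a minimal sketch, reusing the ws object from above with a purely illustrative {op, text} message format:

ws.onopen = function() {
    //serialize the outgoing structure to a string before sending
    ws.send(JSON.stringify({ op: "chat", text: "hello" }));
};
ws.onmessage = function (evt) {
    var msg = JSON.parse(evt.data); //evt.data arrives as plain text
    console.log("received op: " + msg.op);
};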
A C# based example can be downloaded at http://www.codeproject.com/KB/webservices/c_sharp_web_socket_server.aspx. Note that it will not work with current implementations (e.g. WebSockets draft 76).

 The current protocol draft can be viewed at http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-02. Opera provides a WebSocket test suite at http://testsuites.opera.com/websockets/; in Opera, the feature can be enabled via opera:config#UserPrefs|EnableWebSockets.