http://en.wikipedia.org/wiki/MHTML.
MHTML stands for MIME HTML. It is used to bind resources which are typically represented by external links, such as image and sound files, along with HTML code into a single file. The key to MHTML is that the content is encoded as if it were an HTML email message, using the MIME type multipart/related. The first part is the HTML file, encoded normally. Subsequent parts are additional resources, identified by their original URLs. This format is sometimes referred to as MHT, after the suffix .mht given to such files by default when created by Microsoft Word, Internet Explorer or Opera. MHTML is a proposed standard, circulated in a revised edition in 1999 as RFC 2557. ...
Web browser support
Outside of Internet Explorer, support for this format is variable. The process for saving a web page along with its resources as an MHTML file is not standardized across those browsers that do provide support. Due to this, the same web page saved as an MHTML file using different browsers may render differently on each browser.
Internet Explorer
The .mht format was introduced in 1999 with Internet Explorer 5. Saving in this format allows users to save a web page and its resources as a single MHTML file called a "Web Archive", where all images and linked files will be saved as a single entity. It may, however, be unable to correctly save certain complex web pages, especially those containing scripts. ...
Firefox
The web browser Mozilla Firefox, as of version 2, does not include direct support for saving or opening web pages as MHTML files. Though there is already source code available for viewing MHTML files within the Thunderbird project, this is filed as unsolved bug 18764 within the Firefox project since 1999. There is a MHTML extension: UnMHT. This works in Firefox 2.0. It is available under the MPL 1.1/GPL 2.0/LGPL 2.1 licence.
For Firefox 1.5, this functionality can be obtained on Windows and Linux operating systems by installing a freely available third-party XPI file from Mozilla Archive Format extension, though the .mht files it generates are not fully compatible with Microsoft's products.
- From UnMHT version 2.8.0 English and German locales are included.
- As of version 3.0.0 extracting data to a MHTML file is supported.
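The multipart/related layout described in the MHTML excerpt above (HTML first, then each resource labelled with its original URL) can be sketched roughly as follows. This is only an illustration: the method name build_mhtml, the boundary string, and the example.com URLs are assumptions made for the sketch, and the output is not guaranteed to open in every browser that reads .mht files.

```ruby
# Minimal sketch of an MHTML (multipart/related) document.
# build_mhtml, the boundary text and the example URLs are hypothetical.
CRLF = "\r\n"

def build_mhtml(page_url, html, resources, boundary: "----=_sketch_boundary")
  headers = [
    "MIME-Version: 1.0",
    %(Content-Type: multipart/related; type="text/html"; boundary="#{boundary}")
  ]

  # First part: the HTML file itself.
  parts = [[
    "Content-Type: text/html; charset=utf-8",
    "Content-Transfer-Encoding: 8bit",
    "Content-Location: #{page_url}",
    "",
    html
  ].join(CRLF)]

  # Subsequent parts: resources, identified by their original URLs.
  resources.each do |url, (mime_type, bytes)|
    encoded = [bytes].pack("m").strip.gsub("\n", CRLF) # base64, wrapped lines
    parts << [
      "Content-Type: #{mime_type}",
      "Content-Transfer-Encoding: base64",
      "Content-Location: #{url}",
      "",
      encoded
    ].join(CRLF)
  end

  body = parts.map { |part| "--#{boundary}#{CRLF}#{part}#{CRLF}" }.join
  headers.join(CRLF) + CRLF + CRLF + body + "--#{boundary}--#{CRLF}"
end

doc = build_mhtml(
  "http://example.com/index.html",
  "<html><body><img src=\"http://example.com/dot.gif\"></body></html>",
  # "GIF89a" is placeholder bytes, not a complete image.
  { "http://example.com/dot.gif" => ["image/gif", "GIF89a"] }
)
puts doc
```

The HTML part refers to the image by its original URL, and the matching Content-Location header on the second part is what lets a reader resolve that reference inside the archive.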
http://en.wikipedia.org/wiki/Data:_URI_scheme.
The data: URI scheme, defined in IETF standard RFC 2397, is a URI scheme that allows inclusion of small data items inline, as if they were being referenced as an external resource. They tend to be far simpler than alternative inclusion methods, such as MIME with cid: or mid:. According to the wording in the RFC, data: URIs are in fact URLs, although they do not actually locate anything. data: URIs are currently supported by:
- Gecko and its derivatives, such as Firefox
- ...
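As a small illustration of the data: scheme described above, the following sketch builds a base64 data: URI from a string. The method name data_uri and the default MIME type are assumptions made for this example.

```ruby
# Build a base64 data: URI for a small inline item.
# data_uri is a hypothetical helper name used only in this sketch.
def data_uri(bytes, mime_type = "text/plain")
  encoded = [bytes].pack("m0") # "m0" is base64 without line breaks
  "data:#{mime_type};base64,#{encoded}"
end

puts data_uri("Hello, world!")
```

The resulting string can be used wherever a URL to a small resource would go, for example as the src of an img element when the MIME type is an image type.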
http://wiki.whatwg.org/wiki/FAQ.
What is the WHATWG?
The WHATWG is a growing community of people interested in evolving the Web. It focuses primarily on the development of HTML and APIs needed for Web applications.
The WHATWG was founded by individuals of Apple, the Mozilla Foundation, and Opera Software in 2004, after a W3C workshop. Apple, Mozilla and Opera were becoming increasingly concerned about the W3C’s direction with XHTML, lack of interest in HTML and apparent disregard for the needs of real-world authors. So, in response, these organisations set out with a mission to address these concerns and the Web Hypertext Application Technology Working Group was born.
What does the acronym WHATWG stand for?
It stands for "Web Hypertext Application Technology Working Group".
What is the WHATWG working on?
The WHATWG is working on HTML 5 (see below). In the past it has worked on Web Forms 2.0 and Web Controls 1.0 as well. Web Forms 2.0 (see below) has reached a stable stage and we're awaiting implementation experience. Web Controls 1.0 has been abandoned for now, awaiting what XBL 2.0 will bring us.
What is HTML 5?
HTML 5 is the main focus of the WHATWG community and also that of the (new) W3C HTML Working Group. HTML 5 is a new version of HTML 4.01 and XHTML 1.0 addressing many of the issues of those specifications while at the same time enhancing (X)HTML to more adequately address Web applications. Besides defining a markup language that can be written in both HTML (HTML5) and XML (XHTML5) it also defines many APIs that form the basis of the Web architecture. These APIs are known to some as "DOM Level 0" and have never been documented. Yet they are extremely important for browser vendors to support existing Web content and for authors to be able to build Web applications.
The XML Binding Language (XBL) describes the ability to associate elements in a document with script, event handlers, CSS, and more complex content models, which can be stored in another document. This can be used to re-order and wrap content so that, for instance, simple HTML or XHTML markup can have complex CSS styles applied without requiring that the markup be polluted with multiple semantically neutral div elements. It can also be used to implement new DOM interfaces, and, in conjunction with other specifications, enables arbitrary tag sets to be implemented as widgets. For example, XBL could be used to implement the form controls in XForms or HTML.
--) in your comment!
Timothy L. Warner (<2005-12-01). Mother Tongue Annoyances » Difference Between Em Dash and En Dash (http://www.mtannoyances.com/?p=218).
Many people who attempt to put comments into their HTML code -- text intended for the reference of the developer but not to be displayed by the browser -- get the syntax wrong, which can cause some browsers to mistakenly display part of the comment, or even worse, consider big chunks of the rest of the document to be a comment and ignore them. This widespread confusion about comment syntax is understandable given that HTML comment syntax is a rather inscrutable outgrowth of SGML (Standard Generalized Markup Language), a language devised for the formatting of government documents which formed the basis of HTML. Maybe some of the same people who devised the 1040 tax forms also created the syntax rules for SGML, and hence HTML.
While the popular conception is that comments open with <!-- and end with -->, this isn't quite completely accurate. Actually, comments start and end with "--", as in "-- This is a comment --", but such comments can only occur within the proper SGML context, which happens to be a block starting with <! and ending with >. This ends up producing the commonly observed comment syntax, but it requires the additional condition that you shouldn't have "--" occur in the middle of the comment, because that would mark the end of the comment. Actually, you can use "--" if it's followed with another "--", since multiple comments are allowed. So the following is legal:
<!-- This is a comment -- -- and here's more -->
But since you don't want to have to be so careful in counting your pairs of dashes, it's better not to include any double dashes anywhere in a comment line, so you can be sure the proper syntax is followed. This can get to be a bit of a problem in commenting out JavaScript code (recommended to hide it from older browsers), since "--" is frequently encountered as a decrement operator. You'll have to use your best judgment in such cases about whether to rewrite your JavaScript code to avoid this operator, or live with a malformed "comment" that probably won't crash the common browsers which are used to dealing with such bad syntax anyway. Just try not to use comments like
<!-------------->
with so many dashes you're likely to lose count.
My comment (2007-06-19 15:54): Actually if it's JavaScript you're trying to comment out, there are much better ways, such as the /* */ multi-line JavaScript comment, or enclosing the code in if(false) { }.
2005-10-13. HTML Comments - Anne’s Weblog (http://annevankesteren.nl/2005/10/comments).
... A comment start is <!. A comment end is >. No variations. In between you have dashes. Ugh. Dashes are --. Inside a pair of matching dashes a comment end must be treated as a literal and therefore Acid 2 works as it does. Fun, not? [...] Per current specifications dashes have to be directly adjacent to a comment start; no whitespace in between. Browsers support whitespace in between. [...] Anyway, what happens with <!-- -- --> <!-- -->? When the parser reaches the end there is a problem. Dashes are missing. The last > is treated as part of the comment. The browser needs to reparse this. And now it gets interesting. The first < is to be treated as a literal, which makes it essentially, and things that follow it, a text node. That leaves us with a text node <!-- -- --> and a comment containing a space. That is about it. I hope you find it as confusing as I do. I do get it though. And it sort of makes sense. ...
HTML and SGML comments (http://www.howtocreate.co.uk/SGMLComments.html).
This article is now being made obsolete
Due to the problems pointed out by this article, SGML comments have been removed from Acid 2, and future HTML versions will not require SGML comments. Browsers that have implemented them are now expected to remove their support for SGML comments, for all HTML versions.
...
Enter the double dash
Suddenly it is not so easy any more. You see, browsers were wrong. HTML was created as a subset of SGML, and SGML dictates a more complicated view of comments. Browsers all ignored SGML comments though, and stuck with the comment format they had always used. This was a sensible approach, in my opinion.
For a little while, Opera experimented with "correct" comment handling, and found that predictably, no Web page authors were aware of it, resulting in a lot of broken sites. So Opera changed back to using the format that everyone understood. Then Mozilla decided to implement them "properly" as well. It was implemented only in strict mode, but that did not stop it causing problems. Then the Acid 2 test came along, and for debatable reasons, they decided to include SGML comments in it.
It would have been better to rewrite the HTML standard to reflect the reality of what authors were using, but no. So browser vendors are now forced to implement SGML comments, or risk embarrassment, even though they will cause Web pages to break. Why will they break?
To put it simply, the double dash at the start and end of the comment do not start and end the comment. Double dash indicates a change in what the comment is allowed to contain. The first -- starts the comment, and tells the browser that the comment is allowed to contain > characters without ending the comment. The second -- does not end the comment. It tells the browser that if it encounters a > character, it must then end the comment. If another -- is added, then it goes back to allowing the > characters:
<!-- this can contain > characters -- this can not, so the comment ends here>
Each time a double dash is encountered, it changes the format between allowing, and not allowing the > characters to be inside the comment:
<!-- this can contain > characters -- this can not -- this can contain > characters -- this can not, so the comment ends here>
That example is not actually valid HTML, since the last part (between the last -- and the closing >) is not allowed to contain anything except whitespace. However, the SGML parsing rules will cause it to behave as described, even if there are some other non-whitespace characters in there:
<!-- this can contain > characters -- this can not -- this can contain > characters -->
Note, XML (and therefore also XHTML when served using an XML based content-type) took the sensible step of making it not valid to have -- inside a comment. As a result, trying to use it should result in a parsing error. Because of this, XML and XHTML do not have the SGML comment problem. In practice, I have never seen any real need for SGML comments, so I favour the XML approach. Note that XHTML, if served using the text/html content-type, will be treated as HTML, so the SGML comment parsing rules will be applied.
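The toggling rule described above can be sketched as a tiny scanner. This is a simplified illustration of the SGML rule as the article explains it, not a description of how any particular browser parses comments; the method name sgml_comment_end is an assumption made for this sketch.

```ruby
# Find where an SGML-style comment declaration ends.
# Returns the index just past the closing ">", or nil if the
# declaration never closes. sgml_comment_end is a hypothetical name.
def sgml_comment_end(text)
  return nil unless text.start_with?("<!")

  inside_dashes = false # true while ">" is allowed inside the comment
  i = 2
  while i < text.length
    if text[i, 2] == "--"
      inside_dashes = !inside_dashes # every "--" flips the state
      i += 2
    elsif text[i] == ">" && !inside_dashes
      return i + 1 # a ">" outside a dash pair ends the declaration
    else
      i += 1
    end
  end
  nil
end

sample = "<!-- a > b -- c> tail"
finish = sgml_comment_end(sample)
puts sample[0...finish]
```

With an odd number of double dashes, as in "<!-- x -- y -->", the final > falls inside a dash pair, so the scanner reports that the comment never closes. That is the situation in which a strict parser would swallow or mis-render the rest of the page.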
Let's say you want your web application to print out some state information in a comment on every page to make it easier to debug if there is ever a problem (while you are browsing the site). For example, you want to dump everything in the visitor's session. Let's also say that you deploy this into production including the debug code -- this debug output that is supposed to be invisible on the page and only discoverable if someone does View Source.
Well, that data could easily contain a '--' in the middle of it. And if it did, anything "inside your comment" (you thought) occurring after that '--' would display on the screen (because it would technically be not inside your comment at that point) ... which could be really ugly to anyone using your application.
In Ruby, you could implement a method like this:
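class String
  # Replace every hyphen with its numeric character reference so the
  # text can never contain "--" when placed inside an HTML comment.
  def safe_in_comment
    gsub('-', '&#45;')
  end
end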
Then in your views, you can feel safe to put dumps in comments like this:
<!-- Session = <%= session.inspect.safe_in_comment %> -->
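A possible alternative, sketched here as a plain function instead of a String extension, is to break up runs of hyphens with spaces so that the dump stays readable in View Source. The name comment_safe and the choice of inserting spaces are assumptions made for this sketch; note that it alters the dumped text slightly, which may or may not be acceptable for debugging.

```ruby
# Put a space after every hyphen that is directly followed by another
# hyphen, so the result never contains "--".
# comment_safe is a hypothetical helper name.
def comment_safe(text)
  text.gsub(/-(?=-)/, "- ")
end

dump = { "note" => "a -- b", "rule" => "---" }.inspect
puts "<!-- Session = #{comment_safe(dump)} -->"
```

The lookahead matters here: a plain gsub("--", "- -") would turn "---" into "- --", which still contains a double dash.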
http://www.danshort.com/HTMLentities/ HTML Entities
http://accessify.com/tools-and-wizards/developer-tools/quick-escape/
Original input (please view wiki source):
<nowiki>
<pre class="showimportantbits"><code><span class="moreimp"><!--</span>
<script type="text/javascript">
for( var i = 10; i > 0; i<span class="moreimp">--</span> ) {
if( myar[i].status <span class="moreimp">></span> 3 ) {
ntlp++;
}
}
</script>
--></code>
</nowiki>
Output direct from conversion script (please view wiki source):
<pre class=\"showimportantbits\"><code><span class=\"moreimp\"><!--</span>
<script type=\"text/javascript\">
for( var i = 10; i > 0; i<span class=\"moreimp\">--</span> ) {
if( myar[i].status <span class=\"moreimp\">></span> 3 ) {
ntlp++;
}
}
</script>
--></code></pre>
Which, when used as the contents of a pre tag in MediaWiki, at least, is rendered as this (please view wiki source):
<nowiki>
<pre class=\"showimportantbits\"><code><span class=\"moreimp\"><!--</span>
<script type=\"text/javascript\">
for( var i = 10; i > 0; i<span class=\"moreimp\">--</span> ) {
if( myar[i].status <span class=\"moreimp\">></span> 3 ) {
ntlp++;
}
}
</script>
--></code>
</nowiki>
Desired output (see http://www.howtocreate.co.uk/SGMLComments.html) (don't view source -- is rendered correctly by MediaWiki):
<!--
<script type="text/javascript">
for( var i = 10; i > 0; i-- ) {
if( myar[i].status > 3 ) {
ntlp++;
}
}
</script>
-->
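For comparison, the escaping behaviour expected from such a tool can be sketched in a few lines of Ruby. This is not the conversion script used above; escape_html and HTML_ESCAPES are names invented for this illustration.

```ruby
# Escape the characters that are special in HTML text and attributes.
# escape_html and HTML_ESCAPES are hypothetical names for this sketch.
HTML_ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  '"' => "&quot;"
}.freeze

def escape_html(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

puts escape_html('<span class="moreimp">')
puts escape_html("&gt;") # input that is already escaped gets escaped again
```

Because & is itself escaped, text that already contains an entity is "double escaped", which is what is needed when the goal is to display the source of an already-escaped snippet.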
Problems:
- I expected it to convert " to &quot;, but it converted it to \" instead.
- &gt; from my original source (yes, it's "already escaped") ended up being "unescaped" after I'd passed it through this filter, resulting in this: >. But I expected and desired for it to be "double escaped": &amp;gt;
http://www.cs.tut.fi/~jkorpela/html/nobr.html#zwsp.
It seems that in general the only cure is to use the nonstandard, Netscape-invented (!) nobr markup. It has never been adequately defined, and browsers generally treat it in a command-like fashion: <nobr> is taken as "disallow line breaks from now on" and </nobr> says "line breaks allowed from now on". But it is safest to use it as text-level markup only. This should suffice, since we normally would use nobr for short pieces of text only, as in <nobr>vis-a-vis</nobr> or <nobr>-a</nobr>.
A set of radio buttons that you want one per line, for example.
It's usually more convenient in the short term, if nothing else, to just put a br at the end... less work than setting up an unordered list, for example.
It would probably be more semantic/structured, however, to go ahead and wrap it in a tag rather than using a tag as a separator. (One of the slides on [1] talked about how <p> is more semantic than <br/>...)
http://www.xml.com/pub/a/2003/03/19/dive-into-xml.html.
... The current MIME type situation is a bit of a mess. According to the W3C's Note on XHTML Media Types:
- HTML 4 should be served as text/html. This is what everybody does, so no problem there.
- "HTML compatible" XHTML (as defined in appendix C of the XHTML 1.0 specification) may be served as text/html, but it should be served as application/xhtml+xml. This is probably the sort of XHTML you're writing now, so you could go either way.
- XHTML 1.1 should not be served as text/html.
- Although the spec is not finalized yet, all indications are that XHTML 2.0 must not be served as text/html.
So the first step on the road to XHTML 2.0 is conquering the XHTML MIME type, application/xhtml+xml. ...
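For "HTML compatible" XHTML, where either MIME type is allowed, one way to "go either way" is to pick the type per request. The sketch below assumes the choice is made from the request's Accept header, which is an assumption of this example and not something the quoted text prescribes; it also ignores q-values, and the name content_type_for is hypothetical.

```ruby
XHTML_MIME = "application/xhtml+xml"
HTML_MIME = "text/html"

# Choose a Content-Type for an "HTML compatible" XHTML 1.0 page.
# Simplification: q-values and wildcards in the Accept header are ignored.
def content_type_for(accept_header)
  accepted = accept_header.to_s.split(",").map do |item|
    item.split(";").first.to_s.strip.downcase
  end
  accepted.include?(XHTML_MIME) ? XHTML_MIME : HTML_MIME
end

puts content_type_for("text/html,application/xhtml+xml,application/xml;q=0.9")
puts content_type_for("text/html, */*")
```

Falling back to text/html keeps the page usable in clients that do not announce support for application/xhtml+xml.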
Aliases: HTML