From fianno at jshub.org Thu Nov 26 03:27:14 2009
From: fianno at jshub.org (Fiann O'Hagan)
Date: Thu Nov 26 03:27:18 2009
Subject: [uf-new] Microformats for hidden data
Message-ID: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com>

Hi everyone,

A little while ago my colleague Liam posted on this list about the jsHub project and our ideas for a microformat to replace the proprietary JavaScript currently used for web analytics metadata. He got some good feedback, and I can see there's work we need to do.

Here's the use case we want to address: there is a lot of information currently stored in pages which is encoded in vendor-specific JavaScript variables. There are many reasons why the microformat approach (in principle) would be better than the current situation. Publishers of big sites find that they are now using multiple tags, and therefore it makes sense to have a single version of the data about each page rather than re-declaring it in multiple formats. I also believe that some of this information (page name and category, for example) would be of great interest to search engine spiders if it was accessible.

I would like to take a step back from comments on our specific proposal and ask a much more general question. Are there any materials currently available about information which is not in the visible HTML of the page?

As far as I can see, all the microformats currently in use start with information which is visible in the page, and then add markup to indicate what it represents. For example, with hProduct, you start with the existing product name, price etc in the page, and add the appropriate classes to indicate what these fields represent.

But there is a wealth of information hidden within the page in <meta> tags and in JS blocks.
For example on the microformats.org wiki at http://microformats.org/wiki/hcard-faq

    var wgPageName = "hcard-faq";
    var wgTitle = "hcard-faq";
    var wgAction = "view";

It's quite possible that for web analytics purposes, you might want to use the page name "hcard-faq" which is different from both the HTML title element "hCard FAQ · Microformats Wiki" and the URL path "/wiki/hcard-faq".

Is there any guidance available about these cases, where the information we want to capture is not part of the visible page? Please note that it is human readable, but the person consuming the data is different from the end user browsing the site; for example, it is someone looking at reports on the most popular pages on the site. This means that some of the microformats principles, such as visible data not invisible metadata, can't directly apply.

Thanks for any feedback,

Fiann

From danbri at danbri.org Thu Nov 26 03:58:46 2009
From: danbri at danbri.org (Dan Brickley)
Date: Thu Nov 26 03:58:54 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com>
Message-ID:

On Thu, Nov 26, 2009 at 12:27 PM, Fiann O'Hagan wrote:
> Hi everyone,
>
> A little while ago my colleague Liam posted on this list about the
> jsHub project and our ideas for a microformat to replace the
> proprietary JavaScript currently used for web analytics metadata. He
> got some good feedback, and I can see there's work we need to do.
>
> Here's the use case we want to address: there is a lot of information
> currently stored in pages which is encoded in vendor-specific
> JavaScript variables.
[...]
> Are there any materials currently available about information which is
> not in the visible HTML of the page?
[...]
> But there is a wealth of information hidden within the page in
> <meta> tags and in JS blocks.

Interesting questions.
My take on microformatism is that a big part of the value has been in encouraging people to look more closely at the tags available in HTML, at their existing official meaning and at the possibilities for using them to carry more specific / tightly defined data structures. You already highlight the existence of <meta>, and I guess I'd just draw a stronger contrast between that and proprietary / random Javascript variables. There's a lot to be said for not having to run a Javascript interpreter to figure out the basic data structures encoded in a Web page. So maybe <meta> is worth some more investigation, rather than just listing it as part of the problem...?

cheers,

Dan

From fianno at jshub.org Thu Nov 26 07:23:18 2009
From: fianno at jshub.org (Fiann O'Hagan)
Date: Thu Nov 26 07:29:37 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To:
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com>
Message-ID: <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com>

Thanks for the quick response Dan.

> You already highlight the existence of <meta>, and I
> guess I'd just draw a stronger contrast between that and proprietary /
> random Javascript variables. There's a lot to be said for not having
> to run a Javascript interpreter to figure out the basic data
> structures encoded in a Web page. So maybe <meta> is worth some more
> investigation, rather than just listing it as part of the problem...?

I agree we shouldn't dismiss the meta tag. The problem as I see it is that the meta tag only allows a simple association of name=value pairs. It doesn't allow any kind of structured data. So you can have a meta tag that gives the author, as recommended in the HTML spec
http://www.w3.org/TR/html401/struct/global.html#h-7.4.4

    For example, to specify the author of a document, one may use the META element as follows:

    <META name="Author" content="Dave Raggett">

But if you want to use hCard to give contact details for the author, you can't, because it's an opaque string.
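
To make the point about flat name=value pairs concrete, here is a minimal sketch (not anything from jsHub) that reads meta elements with Python's standard html.parser. The sample head is made up: the Author line mirrors the example in the HTML 4.01 spec, and the "pagename" entry is a hypothetical analytics field. Every value comes back as a plain string, which is the limitation described above, but it does show that no JavaScript interpreter is needed to get at the data.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            self.pairs[name] = attributes.get("content", "")


# Hypothetical page head. "pagename" is an assumed, non-standard field.
SAMPLE = """
<head>
  <title>hCard FAQ</title>
  <META name="Author" content="Dave Raggett">
  <meta name="pagename" content="hcard-faq">
</head>
"""

collector = MetaCollector()
collector.feed(SAMPLE)
print(collector.pairs)
```

Nothing in the result says whether "Dave Raggett" is a person, an organisation or a nickname; any structure would have to be packed into the string by private convention.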
There's additional complexity with the content of the tag being in an XML attribute rather than a text node too, which complicates the escaping required for the string and means that you cannot include any HTML in the text.

As I understand it, these limitations are what led the W3C to create RDF, which is cross-linked from the meta element in the HTML spec. And the complexity of RDF, is of course what led to the rise of microformats.

One final serious limitation of the meta element is that it is only valid in the head of a document, and not in the body. With more complex pages, for example tabbed layouts, and content served via AJAX, there's a good case to associate "page" metadata with a fragment of the page rather than the entire HTML document. That's not possible unless you can define a wrapper element around the content you are concerned with.

Does that make sense, or should I be looking at it again?

Fiann

From mail at tobyinkster.co.uk Thu Nov 26 08:41:04 2009
From: mail at tobyinkster.co.uk (Toby Inkster)
Date: Thu Nov 26 08:41:11 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com>
Message-ID: <1259253665.708.130.camel@ophelia2.g5n.co.uk>

On Thu, 2009-11-26 at 15:23 +0000, Fiann O'Hagan wrote:
> As I understand it, these limitations are what led the W3C to create
> RDF, which is cross-linked from the meta element in the HTML spec. And
> the complexity of RDF, is of course what led to the rise of
> microformats.

Have you considered using RDFa? This is a set of XHTML attributes which brings the RDF data model to XHTML. (Many parsers also support "tag soup" HTML too.)
--
Toby A Inkster

From fianno at jshub.org Thu Nov 26 09:33:27 2009
From: fianno at jshub.org (Fiann O'Hagan)
Date: Thu Nov 26 09:33:31 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1259253665.708.130.camel@ophelia2.g5n.co.uk>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk>
Message-ID: <1575cd730911260933l786405aag6d2dba54c4183be6@mail.gmail.com>

Hi Toby

> Have you considered using RDFa? This is a set of XHTML attributes which
> brings the RDF data model to XHTML. (Many parsers also support "tag
> soup" HTML too.)

My understanding of RDFa is that it's not possible to include in valid XHTML 1.0 and that in any case there are problems with serving pages with an XML mimetype rather than text/html.

Do you have any real-world examples of RDFa being published? I can see you have created a parser, but I am not aware of many examples outside of the W3 site.

I'm interested in RDFa but I do find the arguments in Tantek's mail from this list quite compelling
http://microformats.org/discuss/mail/microformats-discuss/2006-May/004144.html

I'd be interested to know if anything has changed in the last 3 years.

Fiann

From michael.smethurst at bbc.co.uk Thu Nov 26 09:50:55 2009
From: michael.smethurst at bbc.co.uk (Michael Smethurst)
Date: Thu Nov 26 09:51:01 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911260933l786405aag6d2dba54c4183be6@mail.gmail.com>
Message-ID:

On 26/11/2009 17:33, "Fiann O'Hagan" wrote:

Hi Fiann

> Hi Toby
>
>> Have you considered using RDFa? This is a set of XHTML attributes which
>> brings the RDF data model to XHTML. (Many parsers also support "tag
>> soup" HTML too.)
>
> My understanding of RDFa is that it's not possible to include in valid
> XHTML 1.0

If you want to serve RDFa you'll need to use an RDFa doctype. It's the only change you need to make and is fully supported by W3C validators.

> and that in any case there are problems with serving pages
> with an XML mimetype rather than text/html.

There are all kinds of problems serving pages as XML but it's not required for RDFa - just keep serving as text/html

>
> Do you have any real-world examples of RDFa being published?

The canonical example is the London Gazette:
http://www.london-gazette.co.uk/

We're also using a very small dash of RDFa on bbc.co.uk:
http://www.bbc.co.uk/music/reviews/66gb

...with plans to add much more to bbc.co.uk/programmes in the near future

> I can see
> you have created a parser, but I am not aware of many examples outside
> of the W3 site.
>
> I'm interested in RDFa but I do find the arguments in Tantek's mail
> from this list quite compelling
> http://microformats.org/discuss/mail/microformats-discuss/2006-May/004144.html
>
> I'd be interested to know if anything has changed in the last 3 years.
>
> Fiann
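
A rough sketch of what "an RDFa doctype, still served as text/html" might look like in practice. Several things here are assumptions rather than anything stated in the thread: that the doctype meant is the XHTML+RDFa 1.0 one, the Dublin Core prefix, the property names and the sample values. The collector below is not an RDFa processor; it only gathers property attributes with a tag-soup parser, to illustrate that the markup can be read from an ordinary HTML page.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Gather RDFa-style `property` attributes (not a full RDFa parser)."""

    def __init__(self):
        super().__init__()
        self.doctype = None
        self.properties = []
        self._open = None  # (property, tag, text chunks) while inside an element

    def handle_decl(self, decl):
        # Called with the text inside <! ... > for the doctype declaration.
        self.doctype = decl

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is None:
            return
        if "content" in attributes:
            # Value given out of sight, in the attribute.
            self.properties.append((prop, attributes["content"]))
        else:
            # Value is the visible text of the element.
            self._open = (prop, tag, [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[1]:
            prop, _, chunks = self._open
            self.properties.append((prop, "".join(chunks).strip()))
            self._open = None


# Hypothetical page; the doctype is assumed to be XHTML+RDFa 1.0.
SAMPLE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
<head><title>hCard FAQ</title></head>
<body>
  <div about="/wiki/hcard-faq">
    <h1 property="dc:title">hCard FAQ</h1>
    <span property="dc:creator" content="Example Author"></span>
  </div>
</body>
</html>
"""

collector = PropertyCollector()
collector.feed(SAMPLE)
print(collector.doctype)
print(collector.properties)
```

The second property shows the feature relevant to this thread: its value lives in a content attribute and is never rendered.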
From scott at makedatamakesense.com Thu Nov 26 09:58:50 2009
From: scott at makedatamakesense.com (Scott Reynen)
Date: Thu Nov 26 10:06:50 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1259253665.708.130.camel@ophelia2.g5n.co.uk>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk>
Message-ID: <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com>

On Nov 26, 2009, at 9:41 AM, Toby Inkster wrote:

>> As I understand it, these limitations are what led the W3C to create
>> RDF, which is cross-linked from the meta element in the HTML spec.
>> And
>> the complexity of RDF, is of course what led to the rise of
>> microformats.

This isn't very accurate. RDF was not created primarily as a response to HTML's limitations, nor microformats as a response to RDF's complexity. The two only rarely overlap on the same use case. It's generally pretty clear which tool is more appropriate for a given job. For example:

> Have you considered using RDFa?

I agree, this seems much more in line with RDFa than microformats. To do this in microformats, we'd need to throw out the visible data requirement, and re-interpret all of the other guidelines to no longer presume visible data. And after a lot of work, the result would end up looking a lot like RDFa.
--
Scott Reynen
MakeDataMakeSense.com

From scott at makedatamakesense.com Thu Nov 26 10:33:55 2009
From: scott at makedatamakesense.com (Scott Reynen)
Date: Thu Nov 26 10:34:02 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911260933l786405aag6d2dba54c4183be6@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <1575cd730911260933l786405aag6d2dba54c4183be6@mail.gmail.com>
Message-ID: <1DA065AA-CCD2-44FF-BC39-B755EDEC8933@makedatamakesense.com>

On Nov 26, 2009, at 10:33 AM, Fiann O'Hagan wrote:

> Do you have any real-world examples of RDFa being published?

It seems to me the scarcity of real-world examples is a drawback inherent in what you're trying to do, not specific to RDFa. I'd note you also have no real-world examples of your own data being published in HTML. You need to use something like RDFa because you have invisible data you want to put in HTML. RDFa is complex largely because it handles invisible data in HTML. People don't widely use RDFa because they find it too complex.

Calling it a "microformat" wouldn't somehow remove the complexity of publishing invisible data in a format focused on visible data. If anything, it would just make things more difficult, because the microformats community is largely composed of people who have intentionally avoided the problem you're trying to solve, many of whom believe it to be unsolvable.
--
Scott Reynen
MakeDataMakeSense.com

From fianno at jshub.org Thu Nov 26 13:56:16 2009
From: fianno at jshub.org (Fiann O'Hagan)
Date: Thu Nov 26 13:56:20 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com>
Message-ID: <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com>

That's interesting Scott. I am not sure I have understood your point completely, but I'd like to explore it.

> This isn't very accurate. RDF was not created primarily as a response to
> HTML's limitations, nor microformats as a response to RDF's complexity.

I agree. RDF is not about a limitation of HTML, but it is an attempt to allow data which is too complex to convey in the meta tag alone.

> I agree, this seems much more in line with RDFa than microformats. To do
> this in microformats, we'd need to throw out the visible data requirement,
> and re-interpret all of the other guidelines to no longer presume visible
> data. And after a lot of work, the result would end up looking a lot like
> RDFa.

Why would it end up looking like RDFa? This is the part I don't understand. RDFa looks like it does because it involves XML namespaces, namespaced values for XML attributes, and URIs. The markup indicates relations between items, where the nature of the relation is defined by resolving a URI.

In contrast, microformats simply use some well known class names. If we have an element with the class of hproduct, it describes a product. Inside that, an element with the class of fn is the product name. There is no URI to dereference to understand what is meant by fn.

So when you say that it would end up looking like RDFa, do you mean in terms of syntax?
Or do you mean in terms of the data being applied as attributes to elements that are otherwise visible, like the about attribute being added to a div?

If it's the second one, then I was imagining something much simpler, which looks like any other microformat, but with some or all of the content in a CSS display:none region of the page. That to me still looks like a microformat, not like RDFa.

Fiann

From tantek at cs.stanford.edu Thu Nov 26 15:40:08 2009
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Thu Nov 26 15:40:30 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com>
Message-ID: <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com>

On Thu, Nov 26, 2009 at 1:56 PM, Fiann O'Hagan wrote:
>> I agree, this seems much more in line with RDFa than microformats. To do
>> this in microformats, we'd need to throw out the visible data requirement,
>> and re-interpret all of the other guidelines to no longer presume visible
>> data. And after a lot of work, the result would end up looking a lot like
>> RDFa.
>
> Why would it end up looking like RDFa? This is the part I don't
> understand. RDFa looks like it does because it involves XML
> namespaces, namespaced values for XML attributes, and URIs. The markup
> indicates relations between items, where the nature of the relation is
> defined by resolving a URI.

Agreed. In practical use cases for marking up visible data, namespaces, bound prefixes etc. have never been necessary.
It appears the desire for abstract bound (or literal) prefix namespaces (often strongly outspoken, especially on email lists - see recent mega-threads at public-html at w3.org for example) is primarily religious/dogmatic/tautological in nature rather than based on real world use cases. This is one of the key reasons why their discussion is explicitly discouraged on microformats mailing lists, and I'll leave it at that.

http://microformats.org/wiki/mailing-lists#Bad_topics_for_discussion

> In contrast, microformats simply use some well known class names.

Well, that plus they've gone through a fairly rigorous development process:

http://microformats.org/wiki/process

> If we have an element with the class of hproduct, it describes a product.
> Inside that, an element with the class of fn is the product name.
> There is no URI to dereference to understand what is meant by fn.

There is no URI that you *MUST* dereference to understand what is meant by fn, but for example if you are using hCard, you *MAY* reference the hCard profile and thus place the "fn" term in URI space:

http://microformats.org/profile/hcard#fn

This is allowed by microformats, because it helps with interop with some other technologies. It's optional because in the vast majority of practical uses, it's unnecessary.

> ... I was imagining something much simpler,
> which looks like any other microformat, but with some or all of the
> content in a CSS display:none region of the page. That to me still
> looks like a microformat, not like RDFa.

As Scott has pointed out, it will be nearly impossible to make a microformat for invisible data given that the process has many steps requiring documentation/existence of visible data to mark up.

However, what you're talking about, simple use of class names that "looks like any other microformat" is actually poshformats:

http://microformats.org/wiki/poshformats

It's an important distinction, take a look.
For your original use case that you mentioned, invisible data for scripts/script libraries, you have a couple of choices:

1. make up a poshformat and take care to NOT call it a microformat (because it isn't one, and you never intend to try to take it through the process)

2. use the "data-*" attributes in HTML5 which were explicitly created to handle the use case of data attributes for scripts/script libraries among other things.

http://microformats.org/wiki/html5#data_attributes

If you're comfortable with starting to use HTML5, I recommend the latter option, because it better reflects your use case, and hopefully will have less chance of implying there is a format of data to interchange when there isn't.

Thanks,

Tantek

--
http://tantek.com/

From scott at makedatamakesense.com Thu Nov 26 17:39:40 2009
From: scott at makedatamakesense.com (Scott Reynen)
Date: Thu Nov 26 17:46:50 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com>
Message-ID: <80B82222-317C-4C55-AE32-3C6AD9A2DDD7@makedatamakesense.com>

On Nov 26, 2009, at 2:56 PM, Fiann O'Hagan wrote:

> So when you say that it would end up looking like RDFa, do you mean in
> terms of syntax? Or do you mean in terms of the data being applied as
> attributes to elements that are otherwise visible, like the about
> attribute being added to a div?

I meant the general extension of HTML beyond what makes sense to most HTML authors, which is what leads, I suspect, to lower adoption.
Even the microformats value-class pattern starts to look a little like RDFa to me, in that the meaning of the markup is not particularly clear to someone who only knows HTML semantics.

> If it's the second one, then I was imagining something much simpler,
> which looks like any other microformat, but with some or all of the
> content in a CSS display:none region of the page. That to me still
> looks like a microformat, not like RDFa.

To me that looks like neither microformats nor RDFa. I think putting non-content in HTML as content goes against HTML semantics in a pretty basic way that neither RDFa nor microformats allow.

On Nov 26, 2009, at 4:40 PM, Tantek Çelik wrote:

> 2. use the "data-*" attributes in HTML5 which were explicitly created
> to handle the use case of data attributes for scripts/script libraries
> among other things.

The prohibition of using data- attributes for public data seems to be a problem with this particular use case, as analytics engines are generally independent of the site being tracked and "These attributes are not intended for use by software that is independent of the site that uses the attributes."

http://dev.w3.org/html5/spec/Overview.html#embedding-custom-non-visible-data

I've never understood why that restriction was added, as it seems to have zero benefit, but it's still there.

Personally, I'd take another look at how far you can get with <meta> tags. If the only issue with those is that they refer to the whole document, there may be a way around that, e.g. using the scheme attribute to identify a section ID.
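
For comparison, here is a small sketch of the data-* option being discussed, read with Python's standard html.parser. The attribute names (data-page-name, data-category) and the element ids are invented for illustration; nothing in the thread or in HTML5 defines them. The example attaches the values to a fragment of the page rather than the whole document, which was one of the stated problems with meta. It does not resolve the objection quoted above about software that is independent of the site.

```python
from html.parser import HTMLParser


class DataAttributeCollector(HTMLParser):
    """Map element ids to the data-* attributes declared on them."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        prefix = "data-"
        data = {
            name[len(prefix):]: value
            for name, value in attrs
            if name.startswith(prefix)
        }
        if data:
            # Fall back to the tag name when the element has no id.
            key = dict(attrs).get("id", tag)
            self.found[key] = data


# Hypothetical tabbed page: only one tab carries analytics metadata.
SAMPLE = """
<div id="examples-tab" data-page-name="hcard-faq" data-category="wiki">
  <p>Examples go here.</p>
</div>
<div id="source-tab">
  <p>Source code goes here.</p>
</div>
"""

collector = DataAttributeCollector()
collector.feed(SAMPLE)
print(collector.found)
```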
--
Scott Reynen
MakeDataMakeSense.com

From fianno at jshub.org Fri Nov 27 01:19:30 2009
From: fianno at jshub.org (Fiann O'Hagan)
Date: Fri Nov 27 01:19:35 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com>
Message-ID: <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com>

Hi Tantek,

> 1. make up a poshformat and take care to NOT call it a microformat
> (because it isn't one, and you never intend to try to take it through
> the process)

No, that's not true. I still think there's a very good use case here for a real microformat, although we have not yet explained it clearly enough. Subject to community acceptance, I want to go back to square one and go through the full process.

> As Scott has pointed out, it will be nearly impossible to make a
> microformat for invisible data given that the process has many steps
> requiring documentation/existence of visible data to mark up.

I was thinking about this last night, and realised that this is critical. What precisely do you mean by "visible" in the context of microformats?

On reflection, it's not the simple binary property that it might first appear. I started by thinking that it means "rendered on the screen when the page is loaded in a browser". But HTML meta tags for one do not meet this criterion. Depending on your browser, you might see something if you do File / Properties, but it's certainly not visible to a casual end user.

What about tabbed content?
For example on http://docs.jquery.com/Core/jQuery the examples and source code appear if you click the tab headings, but otherwise they are not displayed, even though they are in the HTML of the page if you view source.

What about information that is visible only for certain media? If a site has a print stylesheet that hides the nav bar in the printed format, does that count as visible? What about if there is some information like a permalink which is displayed only when the page is printed, and not with the screen CSS?

What about microformat data which is hidden from view? For example, search on Yahoo Local and you will see the business information in hCard format, in visible text. Want to know the latitude and longitude? It's in there too, but it is display:none, so you won't see it unless you have some kind of microformat parser like Operator installed in your browser. (Look for 'geo' in the source).
http://uk.local.yahoo.com/Somerset/Glastonbury/glastonbury/1003476909-e-21142.html

It's the final case which is most closely related to what I am proposing here. They have information which is part of the page but it is neither hidden metadata, nor is it rendered in a default view. It is intended to be visible to a different audience than the casual end user browsing the site.

I want to do the same thing, which is to add information to the page, in such a way that it is accessible to humans looking at the source of the page, and to people with the right parser in their browser, but is not part of the view generally presented to end users.

If what Yahoo Local are doing is not an acceptable use of microformats, then I accept that it's not the right approach for us either. But in the world of mashups, scraping, screen readers and all manner of different ways of consuming HTML, it seems like a very artificial restriction.
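
The hidden geo case can be sketched roughly as follows. The markup is a made-up approximation of an hCard with a display:none geo block, not a copy of the Yahoo Local page, and the coordinates are invented. The collector uses Python's standard html.parser and only notices inline style attributes; a page that hides the block from a stylesheet would need a real CSS engine to detect.

```python
from html.parser import HTMLParser


class GeoCollector(HTMLParser):
    """Read geo latitude/longitude and note whether they sit in a hidden region."""

    FIELDS = {"latitude", "longitude"}

    def __init__(self):
        super().__init__()
        self.values = {}
        self.hidden = {}
        self._stack = []  # one (field or None, hidden flag) entry per open element

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        style = (attributes.get("style") or "").replace(" ", "").lower()
        parent_hidden = self._stack[-1][1] if self._stack else False
        hidden = parent_hidden or "display:none" in style
        classes = set((attributes.get("class") or "").split())
        field = next(iter(classes & self.FIELDS), None)
        self._stack.append((field, hidden))
        if field:
            self.values[field] = ""
            self.hidden[field] = hidden

    def handle_data(self, data):
        field = self._stack[-1][0] if self._stack else None
        if field:
            self.values[field] += data.strip()

    def handle_endtag(self, tag):
        # Assumes well-nested markup without void elements, as in SAMPLE.
        if self._stack:
            self._stack.pop()


# Hypothetical listing; the business name and coordinates are invented.
SAMPLE = """
<div class="vcard">
  <span class="fn org">Example Cafe</span>
  <span class="geo" style="display: none">
    <span class="latitude">51.1456</span>
    <span class="longitude">-2.7140</span>
  </span>
</div>
"""

collector = GeoCollector()
collector.feed(SAMPLE)
print(collector.values)
print(collector.hidden)
```

A parser reads the coordinates exactly as it would if they were rendered; the only difference is that nobody browsing the page will notice when they go stale.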
Fiann

From paul.downey at whatfettle.com Fri Nov 27 03:07:13 2009
From: paul.downey at whatfettle.com (Paul Downey)
Date: Fri Nov 27 03:14:50 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com> <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com>
Message-ID: <59221ac80911270307y5242879fl3880ba1cba0d57eb@mail.gmail.com>

On Fri, Nov 27, 2009 at 9:19 AM, Fiann O'Hagan wrote:
> What about tabbed content? For example on
> http://docs.jquery.com/Core/jQuery the examples and source code appear
> if you click the tab headings, but otherwise they are not displayed,
> even though they are in the HTML of the page if you view source.

The page is HTML with everything visible, with certain portions of the text "hidden" from view when JavaScript is enabled and I believe jQuery UI are moving to use ARIA to provide hints for screen-readers which are JavaScript aware.

So I'm unclear what the issue is, or why Microformats are needed here.
--
Paul (psd)
http://blog.whatfettle.com

From brian.suda at gmail.com Fri Nov 27 04:00:10 2009
From: brian.suda at gmail.com (Brian Suda)
Date: Fri Nov 27 04:00:15 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com> <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com>
Message-ID: <21e770780911270400x37f8ce0fm26ec3c9d695bff9c@mail.gmail.com>

On Fri, Nov 27, 2009 at 9:19 AM, Fiann O'Hagan wrote:
> I was thinking about this last night, and realised that this is
> critical. What precisely do you mean by "visible" in the context of
> microformats?

--- the way I always think about it is, out of sight, out of mind. Files that you may link to, but are not visible in the browser on a daily basis tend to get crufty and the data drifts from reality. I used to have flat vCard files on my server; because I was not staring at them every day through the browser window I forgot to update my address, and the data was incorrect. (How many people actively are keeping their FoaF files current via a text editor? All these files are fine, but the originating source should be very visible to people for editing)

The same applies to <meta> elements or hidden data in HTML. If you are not visibly looking at it every day, there is a potential for the information to be wrong. We're not talking about if it is "visible" when you view source; you don't browse sites by viewing source after following a link.
Microformats have always tried to promote that the data being encoded is "visible", as in rendered in the browser normally so people can see it (with many eyes all bugs are shallow) and if there is a problem, it gets fixed quicker due to this high visibility.

> It's the final case which is most closely related to what I am
> proposing here. They have information which is part of the page but it
> is neither hidden metadata, nor is it rendered in a default view. It
> is intended to be visible to a different audience than the casual end
> user browsing the site.

--- I would say that it is best to avoid display:none on your content. Yahoo! has chosen to do so, I personally wouldn't recommend it, but their audience is very different than mine. Had they shown GEO coordinates it might confuse their customers. They made a choice to hide it and take the risks of data drift. It is their call to do so. The microformats still work, but they are not promoted to be used in this way. Everything SHOULD be visible by default.

> I want to do the same thing, which is to add information to the page,
> in such a way that it is accessible to humans looking at the source of
> the page, and to people with the right parser in their browser, but is
> not part of the view generally presented to end users.

--- this is where microformats are designed to work in the opposite direction. ALWAYS visible by default. You are asking for hidden by default, which is a recipe for crufty data. Microformats were always designed with people in mind first, this seems to be designing for scrapers first but accessible to people.

> If what Yahoo Local are doing is not an acceptable use of
> microformats, then I accept that it's not the right approach for us
> either.

--- I wouldn't advocate the hiding of the data, but they have their reasons and take the risk.
> But in the world of mashups, scraping, screen readers and all
> manner of different ways of consuming HTML, it seems like a very
> artificial restriction.

--- well, we can look at it in a different way. Would you trust a "smaller" set of data that has a higher probability of being accurate, or loads of hidden data that has a higher probability of being inaccurate? Search engines took the first approach. None that I know of still utilize the meta keyword element. It was hidden data, it drifted from reality, and wasn't reliable (but it was everywhere). Instead they look at rel-tag and visible data (who knows their algorithms might even punish or ignore display:none) and use the more trustworthy data instead.

Microformats have been shown to work in the real world already with consuming applications and mash-ups. I don't think this "very artificial restriction" has been as big of a problem as you might think.

I hope this explains a bit more about visible and hidden.

-brian

--
brian suda
http://suda.co.uk

From martin at weborganics.co.uk Fri Nov 27 04:31:56 2009
From: martin at weborganics.co.uk (Martin McEvoy)
Date: Fri Nov 27 04:32:02 2009
Subject: [uf-new] Microformats for hidden data
In-Reply-To: <21e770780911270400x37f8ce0fm26ec3c9d695bff9c@mail.gmail.com>
References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com> <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com> <21e770780911270400x37f8ce0fm26ec3c9d695bff9c@mail.gmail.com>
Message-ID: <4B0FC6BC.20200@weborganics.co.uk>

Hello Brian, All

Brian Suda wrote:
...
> Search engines took the first approach. None that I know
> of still utilize the meta keyword element.
It was hidden data, it > drifted from reality, and wasn't reliable (but it was everywhere). > Instead they look at rel-tag and visible data (who knows their > algorithms might even punish or ignore display:none) and use the more > trustworthy data instead. > As far as search engines go, Google *may* punish you for hidden data (particularly for links and keywords) because, as Brian said, hidden data on your website is perceived as untrustworthy. Google's webmaster guidelines say this (amongst other things)... "If you do find hidden text or links on your site, either remove them or, if they are relevant for your site's visitors, make them easily viewable." http://www.google.com/support/webmasters/bin/answer.py?answer=66353 So really, hidden data is not good for the web in general, not just microformats. Microformats are all about what you can see...and not at all about what you can't. Thanks -- Martin McEvoy WebOrganics http://weborganics.co.uk/ Add to address book: http://transformr.co.uk/hcard/http://weborganics.co.uk/ From fianno at jshub.org Fri Nov 27 06:12:50 2009 From: fianno at jshub.org (Fiann O'Hagan) Date: Fri Nov 27 06:20:33 2009 Subject: [uf-new] Microformats for hidden data In-Reply-To: <1575cd730911270538k46b52e31od823d00a0a0a03b2@mail.gmail.com> References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com> <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com> <21e770780911270400x37f8ce0fm26ec3c9d695bff9c@mail.gmail.com> <1575cd730911270538k46b52e31od823d00a0a0a03b2@mail.gmail.com> Message-ID: <1575cd730911270612m1ea9b70eo2f0925b93cdffaf0@mail.gmail.com> Brian, thank you for the very detailed response. I do understand this better now.
I completely agree about the issue of out of sight, out of mind. It's exactly the problem that applies to web analytics data now. What I want to do is to bring the data a little more out into the visible world. What typically happens on big enterprise sites is that they have an analytics product which requires certain per-page metadata, such as a page name and category. This is different from typical installations of the free tools like Google Analytics, partly because these are larger, more complex sites with deeper analytics needs, and partly because they often have horrible legacy URL structures which make it impossible to just record visits to URLs. There is a lot of existing deployment of these tags; see for example the data at http://www.jgc.org/blog/2009/10/some-real-data-about-javascript-tagging.html Our overriding interest is in making the data available to more than one tag on the page, so that the data doesn't have to be declared multiple times in different formats. It's certainly possible to do this purely in JavaScript, where the data is currently declared. But the secondary goal is to make the information a bit more accessible for the people who are responsible for the content. Many of them are editors who are reasonably comfortable reading HTML, but won't touch JavaScript because it is "programming". Hence the interest in potentially using microformats. Right now, the editors who are responsible for populating the data, and the analysts who are the audience, commonly have no access at all to check whether it's correct or to fix any issues. > Would you trust a > "smaller" set of data that has a higher probability of being accurate, > or loads of hidden data that has a higher probability of being > inaccurate? I agree completely with this sentiment. But my question is, given that there is data which is already hidden, crufty and out-of-sync, can we do something to shed a little more light on it? I hope this helps explain where we're coming from.
Fiann From brian.suda at gmail.com Fri Nov 27 07:34:02 2009 From: brian.suda at gmail.com (Brian Suda) Date: Fri Nov 27 07:34:06 2009 Subject: [uf-new] Microformats for hidden data In-Reply-To: <1575cd730911270612m1ea9b70eo2f0925b93cdffaf0@mail.gmail.com> References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com> <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com> <21e770780911270400x37f8ce0fm26ec3c9d695bff9c@mail.gmail.com> <1575cd730911270538k46b52e31od823d00a0a0a03b2@mail.gmail.com> <1575cd730911270612m1ea9b70eo2f0925b93cdffaf0@mail.gmail.com> Message-ID: <21e770780911270734q58c04ba4o308cf95431e8e2f6@mail.gmail.com> On Fri, Nov 27, 2009 at 2:12 PM, Fiann O'Hagan wrote: > What typically happens on big enterprise sites is that they have an > analytics product which requires certain per-page metadata, such as a > page name and category. --- yup, I know them well. One solution would be to define your own POSH format and/or re-use something like hAtom. Then in the JS code for declaring variables for tracking you can reference the microformats, for instance: Instead of: var page = "news-index"; var campaign = "news" you could replace the declared variables with references to the visible text such as: var page = $('.entity-title'); var campaign = $('a[rel="tag"]'); In the JS you are referencing visible data. As editors change fields in the CMS, the tracking codes, campaigns, sections, and other tracking is done automatically. What you need is the mapping between the visible parts of the page and your specific tracking variables. It also depends on how much you want to connect the two and/or allow editors to be changing these values. 
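The mapping between visible parts of the page and tracking variables described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `trackingSelectors` and `readTrackingData` are hypothetical, the page title is assumed to carry hAtom's `entry-title` class, and plain DOM calls are used in place of jQuery so that the text content (rather than a wrapped element set) ends up in the variable.

```javascript
// Hypothetical mapping: tracking variable name -> CSS selector for visible text.
const trackingSelectors = {
  page: '.entry-title',      // assumes hAtom-style markup for the title
  campaign: 'a[rel="tag"]',  // rel-tag link, as in the message above
};

// `root` is anything that offers querySelector, e.g. `document` in a browser.
// Returns an object with the trimmed visible text for each variable,
// or null where the page has no matching element.
function readTrackingData(root, selectors) {
  const data = {};
  for (const [name, selector] of Object.entries(selectors)) {
    const element = root.querySelector(selector);
    data[name] = element ? element.textContent.trim() : null;
  }
  return data;
}
```

In a browser this might be called as `readTrackingData(document, trackingSelectors)`, with each analytics tag then reading from the one returned object instead of declaring its own copy of the values.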
-brian -- brian suda http://suda.co.uk From fianno at jshub.org Fri Nov 27 08:59:34 2009 From: fianno at jshub.org (Fiann O'Hagan) Date: Fri Nov 27 08:59:37 2009 Subject: [uf-new] Microformats for hidden data In-Reply-To: <21e770780911270734q58c04ba4o308cf95431e8e2f6@mail.gmail.com> References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <60cb038a0911261540x7fbe0a31wb3543b2d1091cd25@mail.gmail.com> <1575cd730911270119r4767473qe9948bb0d61eb33a@mail.gmail.com> <21e770780911270400x37f8ce0fm26ec3c9d695bff9c@mail.gmail.com> <1575cd730911270538k46b52e31od823d00a0a0a03b2@mail.gmail.com> <1575cd730911270612m1ea9b70eo2f0925b93cdffaf0@mail.gmail.com> <21e770780911270734q58c04ba4o308cf95431e8e2f6@mail.gmail.com> Message-ID: <1575cd730911270859u2acbc49sd4a33a3b5625db20@mail.gmail.com> Brian, that's exactly what I am hoping to do; you have captured it precisely. hAtom gives a lot, but not all, of what I am looking for (my baseline is the fields that are common to all the major web analytics products). hAtom is focussed on blog posts rather than generic website pages, and I am not sure it is an exact fit, but the core concept is very similar. The reason I am interested in using microformats is that, by using a standard, I can turn your suggestion of var page = $('.entity-title'); into the hAtom form var page = $('.hfeed .hentry .entry-title'); and it will work across any site with that markup, which is much better than defining our own POSH format specific to a single site. Thanks again for all the detailed feedback and putting up with this long thread.
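As a minimal sketch of the hAtom lookup mentioned above: because an `hfeed` may contain several `hentry` items, the selector can match more than one title on an index page, so this version returns a list. The function name `hAtomEntryTitles` is hypothetical, and plain DOM calls are assumed rather than jQuery.

```javascript
// Collects the visible text of every hAtom entry title under `root`.
// `root` is anything that offers querySelectorAll, e.g. `document`.
// Returns an array of trimmed strings (empty if the page has no hAtom markup).
function hAtomEntryTitles(root) {
  const nodes = root.querySelectorAll('.hfeed .hentry .entry-title');
  return Array.from(nodes, (node) => node.textContent.trim());
}
```

A tracking script that needs a single page name would still have to decide what to do when the list holds zero or several titles; that choice is left open here.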
Fiann 2009/11/27 Brian Suda : > On Fri, Nov 27, 2009 at 2:12 PM, Fiann O'Hagan wrote: >> What typically happens on big enterprise sites is that they have an >> analytics product which requires certain per-page metadata, such as a >> page name and category. > > --- yup, I know them well. One solution would be to define your own > POSH format and/or re-use something like hAtom. > > Then in the JS code for declaring variables for tracking you can > reference the microformats, for instance: > > Instead of: > var page = "news-index"; > var campaign = "news" > > you could replace the declared variables with references to the > visible text such as: > var page = $('.entity-title'); > var campaign = $('a[rel="tag"]'); > > In the JS you are referencing visible data. As editors change fields > in the CMS, the tracking codes, campaigns, sections, and other > tracking is done automatically. What you need is the mapping between > the visible parts of the page and your specific tracking variables. It > also depends on how much you want to connect the two and/or allow > editors to be changing these values. 
> > -brian > > -- > brian suda > http://suda.co.uk From hober0 at gmail.com Fri Nov 27 09:58:23 2009 From: hober0 at gmail.com (Edward O'Connor) Date: Fri Nov 27 10:05:25 2009 Subject: [uf-new] Microformats for hidden data In-Reply-To: <80B82222-317C-4C55-AE32-3C6AD9A2DDD7@makedatamakesense.com> References: <1575cd730911260327y5cbd08c9i2b799644bf22c281@mail.gmail.com> <1575cd730911260723u32b7eea0ha1b36270aa06a5a6@mail.gmail.com> <1259253665.708.130.camel@ophelia2.g5n.co.uk> <4BD96DFD-0790-49B8-A905-6C8664100F45@makedatamakesense.com> <1575cd730911261356hb079426q41117f5ab3531fae@mail.gmail.com> <80B82222-317C-4C55-AE32-3C6AD9A2DDD7@makedatamakesense.com> Message-ID: <3b31caf90911270958lad377b0i7270687c52dc3014@mail.gmail.com> Hi Scott, Tantek suggested: >> 2. use the "data-*" attributes in HTML5 which were explicitly created >> to handle the use case of data attributes for scripts/script >> libraries among other things. You replied: > The prohibition of using data- attributes for public data seems to be > a problem with this particular use case, as analytics engines are > generally independent of the site being tracked and "These attributes > are not intended for use by software that is independent of the site > that uses the attributes." I believe you misunderstand the restriction on data-*="" attributes. Here's how James Graham explained the restriction recently on the HTML WG mailing list[1]: > [A] third party js-library is not considered independent of the site > (since the site must decide to import the js-library into its pages). [...] > To take a slightly different example, it is OK to have data-marquee > that is used by a script that the author includes in the page to > implement marquee effects.
But it is not permitted for a user agent to > provide its own marquee effects based on the presence of a marquee > attribute. -- Edward O'Connor hober0@gmail.com 1. http://lists.w3.org/Archives/Public/public-html/2009Oct/0630.html
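For the analytics case in this thread, the permitted pattern (a script the site itself chooses to include reads the site's own data-* attributes) might look roughly like the sketch below. The attribute names `data-page-name` and `data-page-category`, and the names `DATA_ATTRIBUTES` and `readDataAttributes`, are assumptions made for illustration; none of them come from a specification.

```javascript
// Hypothetical attribute names chosen by the site for its own scripts.
const DATA_ATTRIBUTES = {
  pageName: 'data-page-name',
  pageCategory: 'data-page-category',
};

// `element` is a DOM element, e.g. document.body in a browser.
// getAttribute returns null for attributes that are absent, so missing
// values come back as null rather than throwing.
function readDataAttributes(element) {
  const result = {};
  for (const [key, attribute] of Object.entries(DATA_ATTRIBUTES)) {
    result[key] = element.getAttribute(attribute);
  }
  return result;
}
```

With markup such as `<body data-page-name="hcard-faq">`, a site-included tag could call `readDataAttributes(document.body)` and pass the result to whichever analytics libraries the site has imported.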