For the parenthesis issue, just use a decent editor like vim or emacs and it is a non-issue.
Tangent: I was on the Muni the other day working w/ Scheme on my laptop, and this guy sitting next to me turns out to be a CS master's student. We talk a little about Lisp, and it turns out he doesn't like it, because he couldn't handle having to balance the )'s.
I was like, "counting? the editor does that for you..."
Turns out he was programming with Notepad. I was blown away that someone working on a CS master's would use Notepad for Lisp (or any other serious programming task).
It is true that the HTML library in standard Arc has some inconsistencies. But making a set of macros that generates HTML the way that you want is super-easy. I would guess that it is easier to write your own set of macros to do it than to read the documentation on how someone else's works.
But if you don't get the standard library right, then everyone is going to start using their personal version of it, and then you lose portability, etc.
It is worth getting the language right, even if the fix is trivial.
Basically 'marcup would accept a list representing an abstract syntax tree for the HTML code to be generated.
Then I decided to implement w/html instead, which was almost as good ^^; >.< In fact it's arguably better, since the copious ' marks denote non-Arc code (i.e. HTML tags), as opposed to the marcup style above where , marks denote Arc code.
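To make the "list as abstract syntax tree" idea concrete, here is a minimal sketch in Python rather than Arc. It is not marcup or w/html; the node shape (`[tag, optional-attrs-dict, children...]`) and the name `render` are assumptions made up for illustration.

```python
from html import escape

def render(node):
    """Render a nested-list HTML tree to a string.

    Assumed node shape (hypothetical, not marcup's): a node is either a
    string (text, escaped on output) or a list [tag, attrs, child, ...],
    where the attrs dict is optional.
    """
    if isinstance(node, str):
        return escape(node)
    tag, *rest = node
    attrs = {}
    if rest and isinstance(rest[0], dict):
        attrs, *rest = rest
    attr_text = "".join(
        f' {name}="{escape(str(value), quote=True)}"'
        for name, value in attrs.items()
    )
    body = "".join(render(child) for child in rest)
    return f"<{tag}{attr_text}>{body}</{tag}>"

page = ["html",
        ["body",
         ["h1", "Scores"],
         ["a", {"href": "/top"}, "Top & latest"]]]
print(render(page))
```

The point is only that the tree is ordinary data, so it can be built and transformed with normal list operations before anything is printed.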
Thanks for the tip! I'll add that to the to-investigate list. I am trying to cram as much info into the page as possible, and it is hard to hit the right balance.
Here is the site that I have been working on--a sports news aggregator. For people who think that sports news should be on the front page of the paper.
My current major objection is the size of the headlines, which I think should be larger. Alternatively make the summaries shorter. How do you generate the summaries?
Also, the bottommost part appears too ragged for me. Some amount of measurement may be possible.
A final suggestion: perhaps make the highest-ranked 1 or 2 headlines near the top of the page in even larger text (in addition to the headline+summary currently on the page). The top headlines in large text should maybe be just headlines, no summary, but put an anchor link to the headline+summary version in the rest of the page. Basically, something a little like the banner headline of the front page of a newspaper.
Thanks for the feedback. The summaries are put in manually when the comment is posted, so nothing fancy. In practice one just has to remember to grab some text before using the bookmarklet (which fills in the rest).
A lot of people have said that the ragged bottom takes away a bit from the look and feel. There doesn't seem to be an easy way to fix that. The best I can come up with is to try and estimate the line height of each story, and if it exceeds a certain amount then truncate it, but only if it is on a page w/ other stories.
I'll play around with the top-top headlines idea. It is in line with the look and feel of newspapers.
Thanks for the suggestions!
-----
Edit: The headlines are now bigger. It is better, thanks!
I moved them from 1em to 1.25em after trying it via firebug.
Regarding summaries, I think it would be possible to actually pull some text using Arc (although currently there seems to be no decent way to open a client connection, and certainly there don't seem to be any libraries for client-side HTTP). Of course the summarization would have to be done too. Hmm.
What I could do is if you use the bookmarklet, then it will pull in a best guess as to what the summary is and put it into the box. Then the user can edit that if they want. That wouldn't be too hard.
With ajax I could do the same thing once the user fills in the url.
IMO the hard part is the "best guess". ^^ I've been looking for papers about summarization and haven't found much. Hmm. Maybe look at the title and try to fetch words around words in the title, i.e. use the title's terms as search terms.
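A rough sketch of the "title terms as search terms" idea, in Python: score each candidate paragraph by how many title words it shares and take the best one. The stopword list, the tokenizer, and the names `terms` and `best_by_title` are all assumptions for illustration, not anything from an existing library.

```python
import re

# Tiny illustrative stopword list (an assumption; a real one would be longer).
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "for", "on", "at", "is"}

def terms(text):
    """Lowercased set of words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOPWORDS}

def best_by_title(title, paragraphs):
    """Return the paragraph sharing the most terms with the title.

    Ties go to the earliest paragraph; an empty list gives "".
    """
    title_terms = terms(title)
    return max(paragraphs,
               key=lambda p: len(title_terms & terms(p)),
               default="")
```

There is no stemming here, so "inning" and "innings" count as different words; that alone would probably need fixing before this is useful.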
There is an easy way and a hard way to do it. For most articles, if you take the first element in the DOM that is a paragraph and contains above a certain number of words, then I am guessing that would most times be the leader paragraph.
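As a sketch of that first-long-paragraph heuristic, here is one way it might look in Python using the standard library's `html.parser`. The class name `ParagraphCollector`, the function `lead_paragraph`, and the default cutoff of 20 words are assumptions picked for illustration.

```python
from html.parser import HTMLParser

class ParagraphCollector(HTMLParser):
    """Collect the text of each <p> element, in document order."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buf = None  # None while outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush_paragraph()  # an unclosed <p> ends at the next <p>
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush_paragraph()

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def flush_paragraph(self):
        if self._buf is not None:
            # Join the pieces and collapse runs of whitespace.
            self.paragraphs.append(" ".join("".join(self._buf).split()))
            self._buf = None

def lead_paragraph(page_html, min_words=20):
    """Return the first <p> with more than min_words words, or None."""
    parser = ParagraphCollector()
    parser.feed(page_html)
    parser.close()
    parser.flush_paragraph()  # in case the last <p> was never closed
    for text in parser.paragraphs:
        if len(text.split()) > min_words:
            return text
    return None
```

Bylines, datelines, and "subscribe" blurbs tend to be short, which is why a word-count cutoff skips most of them.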
The second easy way is to do the above most of the time, but have some site-specific things that are used instead.
You could also use some classifying software to ID the proper paragraph. You could have a training set of all the descriptions that have been on the site before, and find text that most matches that text, and use that. Or find the first bit of text that matches beyond a certain threshold, and use that.
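A very crude stand-in for the classifier idea, sketched in Python: pool the words of earlier descriptions into one bag-of-words profile, then return the first paragraph whose cosine similarity to that profile reaches a threshold. This is plain word counting, not a trained classifier, and the names `bag`, `cosine`, `first_match` and the 0.3 threshold are assumptions for illustration.

```python
import math
import re
from collections import Counter

def bag(text):
    """Word counts for a piece of text (lowercased)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def first_match(paragraphs, past_descriptions, threshold=0.3):
    """Return the first paragraph similar enough to past descriptions."""
    profile = Counter()
    for description in past_descriptions:
        profile.update(bag(description))
    for paragraph in paragraphs:
        if cosine(bag(paragraph), profile) >= threshold:
            return paragraph
    return None
```

Common words like "the" dominate raw counts, so something like TF-IDF weighting would likely be needed before the threshold means much.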
The hardest way is to automatically generate a summary. I work in the automated document analysis business, and this is indeed pretty hard to do.
Using MzS 360, no problems at all with uptime. The only problem I had porting was from Arc2 to Anarki Arc2.
For persistence you can probably get a lot of mileage out of making some small changes to the news.arc model layer. I found making changes when necessary was really trivial.
As for advice, I think you should not worry about when/if Arc3 is going out, and just proceed.
You're probably right about not waiting. I probably won't have time until after this weekend anyway, so we'll see if we get an Arc3 gift :)
Good to hear about MzS 360, that's the package that was installed on my local Ubuntu 7.10 system.
I just installed it on an older Ubuntu server that I'll be using for production and got 352, so that should be pretty safe given that's what pg is using.