Good Reason

It's okay to be wrong. It's not okay to stay wrong.

Category: computing

Mac people, PC people

Forget “dog” v “cat” people, “left” v “right”, or “tops” v “bottoms”. The real split in personality is “PC person” v “Mac person”. Hunch.com (via Mashable) gathered a lot of random data about people, and looked at how it correlated with their sense of “Mac” or “PC” identification.

A few details are striking.

First off, the prestige of the Mac is evident from the fact that 25% of respondents self-identified as “Mac people”, even though Mac market share is around 10% (When’d that happen?). Either a lot of people are fibbing, or they’re using both and like the Mac better.

Also noticeable:

Mac people are 50% more likely than PC people to say they frequently throw parties.
PC people are 38% more likely than Mac people to say they have a stronger aptitude for mathematical concepts.
Mac people are 95% more likely to prefer indie films.

And most interestingly for me:

Mac people are 80% more likely than PC people to be vegetarian.

Cutting edge cultural trail-blazers, or insufferable hipster trendoids? We report, you decide.

Talk the Talk Twofer: Google and Bing + Shit happens

Two Talk the Talk episodes have come down the pike today.

One is the “Google and Bing” story about dueling search engines and why being clever sometimes looks the same as being very stupid.

The other is about the phrase “shit happens”, which can get you into a lot of trouble if not handled correctly.

You can find older episodes on our Facebook page. Be sure to like us!

Google and Bing

When I heard that Google had accused Microsoft of copying Google’s results for their search engine Bing, it was like 1995 all over again. Great snakes, I thought, can’t Microsoft develop anything on its own? Yes, I still hold a grudge over Microsoft’s plagiarism of the MacOS. But it appears there’s a bit more to this story.

The way Google unearthed the alleged copying was reminiscent of the ‘copyright traps’ that map-makers set. You don’t want someone copying your map, so you insert fictional towns into it. If anyone else shows the same town on their maps, you know they must have copied you.

In Google’s case, they took the unusual step of hard-coding results for strings of nonsense letters (e.g. “mbzrxpgjys”), so that searching for one would return a certain web page (say, a theatre in Los Angeles). The page wouldn’t even have the search term in it; the pairing was totally arbitrary.

Within a few weeks, sure enough, Bing’s results started to show a few of Google’s hard-coded results. Caught red-handed!
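The trap check itself is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not Google's actual method: the function name, the second nonsense string, and the example.com URLs are all made up for the purpose.

```python
# A minimal sketch of a "copyright trap" check.
# All names and URLs here are hypothetical, for illustration only.

def find_copied_traps(planted, rival_results):
    """Return the trap queries whose planted page appears in the rival's results."""
    return sorted(
        query
        for query, page in planted.items()
        if page in rival_results.get(query, [])
    )

# Nonsense query -> the arbitrary page we planted for it.
planted = {
    "mbzrxpgjys": "https://example.com/la-theatre",
    "qzxkvwplmn": "https://example.com/hardware-store",
}

# What the rival engine returns for the same queries some weeks later.
rival_results = {
    "mbzrxpgjys": ["https://example.com/la-theatre"],
    "qzxkvwplmn": [],
}

print(find_copied_traps(planted, rival_results))
```

Since the planted page has no natural connection to the query, any hit at all is suspicious.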

But Microsoft, it appears, wasn’t copying; at least, not directly. Bing uses crowd-sourcing, a legitimate and very smart way of gathering information. If you’re using the Bing toolbar or Internet Explorer (with ‘Suggested Sites’ on), it’s watching what you do and reporting it to Microsoft. So if you search for something (on Bing or Google), it watches which suggested page you go for, and it upweights that link. That would explain why Google engineers, after trying the links a few times, would trip Bing’s sensor, and why their nonsense link would get into Bing’s results.
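Here's a toy version of that kind of click-based upweighting, in Python. I have no idea how Bing really does it; the ClickRanker class and its methods are my own invention, and a real engine would blend clicks with many other signals.

```python
from collections import defaultdict

class ClickRanker:
    """Toy ranker: pages that users click for a query move up for that query.

    Hypothetical sketch only; not how any real search engine works.
    """

    def __init__(self):
        # query -> url -> number of observed clicks
        self.clicks = defaultdict(lambda: defaultdict(int))

    def record_click(self, query, url):
        """Log that a user searched for `query` and then visited `url`."""
        self.clicks[query][url] += 1

    def rank(self, query):
        """Return known URLs for `query`, most-clicked first (ties by URL)."""
        counts = self.clicks.get(query, {})
        return sorted(counts, key=lambda url: (-counts[url], url))

ranker = ClickRanker()
# A handful of engineers clicking the planted result is enough,
# because nobody else has ever searched for this string.
for _ in range(3):
    ranker.record_click("mbzrxpgjys", "https://example.com/la-theatre")

print(ranker.rank("mbzrxpgjys"))
```

For an ordinary query, a few clicks would be drowned out by everything else. For a nonsense string there is nothing else, so the clicked page wins by default.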

I’m a Microsoft hater — I won’t have MS software on my computer, and I use iWork rather than Office — but I don’t think Microsoft is doing the outright plagiarism that Google is accusing them of. They’re not copying, they’re imitating. It is creepy to have your computer watching you, though, so if you don’t like it, then don’t use the Bing toolbar, and don’t use Internet Explorer. Good advice anyway.

Machine translation could save minority languages

We’ve seen Word Lens, which translates signs automatically. Now this:

Google Translate App works as you speak

In an attempt to break down language barriers the world over, Google have developed an App which allows you to translate your words into another language as you speak.

Users simply speak into the device and the Google Translate app translates your speech and then reads the translation out loud, all in real time.

The person you are conversing with can then respond in their own language and their translated words will be spoken back to you.

But language teachers and linguists can rest easy that they’re not about to be put out of a job just yet: currently the app only supports English and Spanish.

I’d love to play with a copy. At the moment, I suspect it’d be pretty rudimentary and error-prone, but there would be updates. The task of machine translation is as yet unsolved — or should I say the set of problems that converge on MT — but we keep seeing innovations that get us closer and closer to that goal, inch by inch.

If we ever do realise the goal of instantaneous, unconstrained automatic translation, communication would of course be the most obvious beneficiary, but the other would be minority languages. It could potentially save them.

I see it as similar to the OS wars of the 90s, which, like language, were a conflict over standards. Computer operating systems, like languages, require a population of users who can exchange information (in this case, files) with each other. But cross-platform file compatibility issues made this difficult. Operating systems also run applications that won’t run on other systems, so there’s a disincentive to adopt an OS that doesn’t have the software you need. At the time, the Mac was on the bad end of that struggle: there were fewer installed users, and some programs weren’t available for the Mac. I remember feeling very concerned that the Mac OS would die out.

Then the Mac adopted standards that were in common use anyway — text was text no matter what computer you were on, jpegs were jpegs, and Word files didn’t need to be converted. (Perhaps Mac users should be grateful for Word after all.) You couldn’t run the exact same programs, but every computer became able to do mostly the same things: Java, Perl, Flash, and so on. And if you got really desperate, there were Windows emulators. So the cost of settling on a minority OS went way down.

What automatic machine translation does is lower the cost of maintaining a minority language. Languages like English or Mandarin have an irresistible attraction for speakers of other languages because they have a huge install base. They represent economic and social opportunity. If translation between them is easy, then using the other language isn’t an irretrievable commitment.

You could argue that the ease of translation would doom minority languages because the translation might only flow one way: toward the big language. That’s not what happened in the OS wars. People liked their Macs, and the ease of conversion helped them hang on to them. People like their languages, too. They’re important markers of their identity. But not if the cost is too high. MT would bring the cost down.

One space or two after a full stop?

Forget left and right wing, forget coriander lovers v haters. The real divide in our society is between the one spacers and the two spacers. And Slate’s restarted the war with this article: Space Invaders: Why you should never, ever use two spaces after a period

Two-spacers are everywhere, their ugly error crossing every social boundary of class, education, and taste. You’d expect, for instance, that anyone savvy enough to read Slate would know the proper rules of typing, but you’d be wrong; every third e-mail I get from readers includes the two-space error.

I’m a one spacer, and I’ll tell you why: Go to your bookshelf, open up any book, and look after any full stop. You’ll find one space. That’s how the pros do it.

Back in the days of typewriters, all the characters were monospaced, so an ‘m’ took up the same space as an ‘i’. A monospaced font will look like crap if there’s only one space after a full stop, so people were taught to use two. Nowadays, we have computers with well-designed typefaces, so you only need one space, as nature intended.

I can only see two reasons to use two spaces. Either you’re on a clanky old IBM Selectric, or you were taught to type by sadistic nuns who beat you if you forgot the extra space. The former can be cured with a computer, the latter with therapy.

Word Lens

This has to be the coolest machine translation idea I’ve seen for a while.

Imagine taking this app on vacation in a foreign country. Can language goggles be far behind?

Notice that it doesn’t always try to twiddle the word order, even where it would be appropriate. On the other hand, it must be doing all kinds of text and character recognition under the hood. Not to mention colour and size matching. I bet later on they’ll try font matching.

Now if I can just get a copy to play with on my iPhone.

And an iPhone.

Fatherly chat

He swears he was ‘shopped. 
Actually, this conversation is heavily edited and fictionalised. But we have had chats about managing social relations online, and it’s only slightly less fraught than the sex talk.

Joust marathon

John McAllister is challenging a 25-year-old Joust world record. It’s going on now, as I write this. I’m following the live video feed sporadically.

I found out about John’s attempt yesterday morning, had a look, and I thought, “Wow, he’s really good.” At that point, he’d been going for 22 hours.

Then I worked all day, came back to check out the game in the evening, and he was still going. Now I’ve had a night’s sleep, and he’s still going. He’ll need to go for about 60 hours total to beat the 107 million points. When he takes a break, he just walks away from the controls and burns off a few of the hundreds of extra guys that he’s built up.

Joust is a fast game at the higher levels, and the gameplay is more or less constant. It requires an almost cyborgian level of endurance, but there he is, working with precision at a frenetic pace. He always knows exactly where to be, whether facing the ‘unbeatable?’ pterodactyls, or taking on the blue knights at the top of the screen, predicting their unpredictable fluttery arcs.

So all right, yes, it is the same thing over and over again. And yes, it goes for a long time. Even so, I find the marathon to be strangely compelling viewing. Kind of like when I was a kid in Cheney, probably hanging out at Zip’s, watching someone who was really good. Video games are time machines.

UPDATE: He’s done it. All hail Sir John. His record will live in the annals of history. Ages hence, bards will sing of his jousting exploits, and maidens will swoon.

Or it’ll be YouTubed, which is close to immortality.

Emoticon test

Here’s a survey about emoticons that you can take. I recognised some, but others I had to guess.

I like to see what other linguists are doing, and it’s fun to guess what the work is intended for. I’d say this work is part of sentiment analysis: working out automatically how a writer is feeling about what they’re writing. Or tweeting.

So help a fellow linguist out and take the test. It’s quick, and sort of fun.

Who do you write like?

I pasted a longish blog post into I Write Like, and it said:

I write like
George Orwell

I Write Like by Mémoires, Mac journal software. Analyze your writing!

While I appreciate the compliment, I wish it would be more specific as to how it got that assessment. I can make a few guesses.

It seems obvious that this uses some kind of nearest-neighbour search. Take a corpus of authors, break their works into good-sized chunks, and then find the closest match for whatever the user gives you.
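To make that guess concrete, here's a minimal nearest-neighbour sketch in Python. It is my assumption about the general shape, not the site's actual code: it compares raw word counts with cosine similarity, and the function names, chunk size, and two-author toy corpus are placeholders.

```python
import math
from collections import Counter

def chunks(words, size):
    """Split a list of words into consecutive chunks of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]

def vectorise(words):
    """Bag-of-words vector: lower-cased word -> count."""
    return Counter(w.lower() for w in words)

def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def nearest_author(text, corpus, chunk_size=200):
    """Return the author whose best-matching chunk is closest to `text`."""
    query = vectorise(text.split())
    best_author, best_score = None, -1.0
    for author, work in corpus.items():
        for chunk in chunks(work.split(), chunk_size):
            score = cosine(query, vectorise(chunk))
            if score > best_score:
                best_author, best_score = author, score
    return best_author

# Toy corpus with made-up "authors", just to show the call.
corpus = {
    "author_a": "the cat sat on the mat the cat slept",
    "author_b": "rockets launch into orbit around planets",
}
print(nearest_author("a cat on a mat", corpus))
```

Swap in a different `vectorise` and the same search works over whatever features you think capture style.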

But what constitutes a match? We could use n-grams (words, and strings of words), as we do in many computational language tasks, but just matching the words in a book doesn’t mean you write like the author. Sure, Steinbeck and Faulkner wrote different words in their books just because of the topics they treated, but that’s not what we mean by writing style.

My guess is that writing style is more about patterns of words, especially function words like prepositions and conjunctions. (You may have noticed I start a lot of sentences with conjunctions like ‘but’ and ‘and’.) I’d try running all the words through a part-of-speech tagger, and see what matches that data best. Just a guess though.
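A crude way to try the function-word idea without a tagger is to count a fixed list of function words and compare the proportions. The Python below is a sketch under that assumption: the word list is a small hand-picked sample, not a proper inventory, and the function names are mine.

```python
from collections import Counter

# A small, hand-picked sample of function words (an assumption, not a full list).
FUNCTION_WORDS = ("and", "but", "or", "of", "in", "on", "to", "with", "for", "the", "a")

def function_word_profile(text):
    """Map each function word to its share of all words in `text`."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return {w: counts[w] / total for w in FUNCTION_WORDS}

def profile_distance(p, q):
    """Sum of absolute differences between two profiles (0 means identical)."""
    return sum(abs(p[w] - q[w]) for w in FUNCTION_WORDS)

mine = function_word_profile("But the cat sat. And the dog ran.")
print(mine["but"], mine["the"])
```

A part-of-speech tagger would give richer categories, but even this much separates the habitual "but"-starters from everyone else.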

I wonder if Orwell writes like Orwell. Here are three adjacent passages from Orwell’s Down and Out in Paris and London, with the computer’s assessment.

Or there was Henri, who worked in the sewers. He was a tall, melancholy man with curly hair, rather romantic-looking in his long, sewer-man’s boots. Henri’s peculiarity was that he did not speak, except for the purposes of work, literally for days together. Only a year before he had been a chauffeur in good employ and saving money. One day he fell in love, and when the girl refused him he lost his temper and kicked her. On being kicked the girl fell desperately in love with Henri, and for a fortnight they lived together and spent a thousand francs of Henri’s money. Then the girl was unfaithful; Henri planted a knife in her upper arm and was sent to prison for six months. As soon as she had been stabbed the girl fell more in love with Henri than ever, and the two made up their quarrel and agreed that when Henri came out of jail he should buy a taxi and they would marry and settle down. But a fortnight later the girl was unfaithful again, and when Henri came out she was with child. Henri did not stab her again. He drew out all his savings and went on a drinking-bout that ended in another month’s imprisonment; after that he went to work in the sewers. Nothing would induce Henri to talk. If you asked him why he worked in the sewers he never answered, but simply crossed his wrists to signify handcuffs, and jerked his head southward, towards the prison. Bad luck seemed to have turned him half-witted in a single day.

I write like
H. P. Lovecraft

Or there was R., an Englishman, who lived six months of the year in Putney with his parents and six months in France. During his time in France he drank four litres of wine a day, and six litres on Saturdays; he had once travelled as far as the Azores, because the wine there is cheaper than anywhere in Europe. He was a gentle, domesticated creature, never rowdy or quarrelsome, and never sober. He would lie in bed till midday, and from then till midnight he was in his corner of the bistro, quietly and methodically soaking. While he soaked he talked, in a refined, womanish voice, about antique furniture. Except myself, R. was the only Englishman in the quarter.

I write like
Charles Dickens

There were plenty of other people who lived lives just as eccentric as these: Monsieur Jules, the Roumanian, who had a glass eye and would not admit it, Fureux the Limousin stonemason, Roucolle the miser — he died before my time, though — old Laurent the rag-merchant, who used to copy his signature from a slip of paper he carried in his pocket. It would be fun to write some of their biographies, if one had time. I am trying to describe the people in our quarter, not for the mere curiosity, but because they are all part of the story. Poverty is what I am writing about, and I had my first contact with poverty in this slum. The slum, with its dirt and its queer lives, was first an object-lesson in poverty, and then the background of my own experiences. It is for that reason that I try to give some idea of what life was like there.

No wonder Orwell had writer’s block: schizophrenia.

UPDATE: Thanks to Kuri for that link in comments. It seems the author used

vocabulary (use of words), number of words, commas, and semicolons in sentences, number of sentences with quotation marks and dashes (direct speech).

I’d say this could be smartened up considerably. Just including some simple features would help, like the ratio of singletons (words appearing once) to other words, appearance of conjunctions, or ranking all the words by frequency and comparing lists.
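Two of those features are easy to sketch in Python. These are my own rough readings of them, with made-up function names: the singleton ratio here is the share of distinct words that appear exactly once, and the list comparison is a simple overlap of each text's top-ranked words.

```python
from collections import Counter

def tokens(text):
    """Lower-cased words with surrounding punctuation stripped."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]

def singleton_ratio(text):
    """Share of distinct words that occur exactly once (0.0 for empty text)."""
    counts = Counter(tokens(text))
    if not counts:
        return 0.0
    singles = sum(1 for c in counts.values() if c == 1)
    return singles / len(counts)

def top_words(text, n=10):
    """The n most frequent words, most frequent first."""
    return [w for w, _ in Counter(tokens(text)).most_common(n)]

def rank_overlap(text_a, text_b, n=10):
    """Jaccard overlap between the two texts' top-n word sets (0.0 to 1.0)."""
    a, b = set(top_words(text_a, n)), set(top_words(text_b, n))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

print(singleton_ratio("the cat and the dog"))
print(rank_overlap("the cat the cat", "the cat the cat", n=2))
```

A fuller comparison would look at the order of the ranked lists too, not just which words make the cut.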

This kind of makes me want to try building a better system. I won’t (for lack of time), but I think I will keep in mind that if you can take interesting work in natural language processing and make a simple web implementation, people will think it is interesting. You can also have a lot of English major hotheads sniping at you because you snubbed Toni Morrison. Wouldn’t that be fun!

 

© 2024 Good Reason
