Showing posts with label Unstructured data. Show all posts

Tuesday, January 29, 2013

Long Data Is Still Big Data

You add the time dimension to Big Data and you get Long Data. Long Data is still Big Data.

Stop Hyping Big Data and Start Paying Attention to ‘Long Data’

crunching big numbers can help us learn a lot about ourselves. ..... But no matter how big that data is or what insights we glean from it, it is still just a snapshot: a moment in time. ..... as beautiful as a snapshot is, how much richer is a moving picture, one that allows us to see how processes and interactions unfold over time? ..... many of the things that affect us today and will affect us tomorrow have changed slowly over time ...... Datasets of long timescales not only help us understand how the world is changing, but how we, as humans, are changing it. Without this awareness, we fall victim to shifting baseline syndrome. This is the tendency to shift our “baseline,” or what is considered “normal,” blinding us to shifts that occur across generations (since the generation we are born into is taken to be the norm). ..... Shifting baselines have been cited, for example, as the reason why cod vanished off the coast of Newfoundland: overfishing fishermen failed to see the slow, multi-generational loss of cod since the population decrease was too slow to notice in isolation. ..... Fields such as geology and astronomy or evolutionary biology, where data spans millions of years, rely on long timescales to explain the world today. History itself is being given the long data treatment, with scientists attempting to use a quantitative framework to understand social processes through cliodynamics, as part of digital history. Examples range from understanding the lifespans of empires (does the U.S. as an “empire” have a time limit that policy makers should be aware of?) to mathematical equations of how religions spread (it’s not that different from how non-religious ideas spread today). ...... building a clock that can last 10,000 years .... the 26,000-year cycle for the precession of equinoxes ...... Just as big data scientists require skills and tools like Hadoop, long data scientists will need special skillsets. Statistics are essential, but so are subtle, even seemingly arbitrary pieces of knowledge such as how our calendar has changed over time
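The point about calendars is easy to underestimate. A date recorded before a country switched from the Julian to the Gregorian calendar does not line up with a modern date until it is converted, and different countries switched in different centuries. Here is a minimal sketch of that conversion in Python, assuming the input date was written in the Julian calendar. The function names are my own, not from the article, and the arithmetic is the standard Julian Day Number formula rather than anything the article prescribes.

```python
from datetime import date

# The Julian Day Number (JDN) of 0001-01-01 in the proleptic Gregorian
# calendar is 1721426, and Python gives that date ordinal 1,
# so ordinal = JDN - 1721425.
JDN_TO_ORDINAL_OFFSET = 1721425


def julian_calendar_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number for a date written in the Julian calendar."""
    a = (14 - month) // 12   # 1 for January and February, otherwise 0
    y = year + 4800 - a      # count years from a March-based start
    m = month + 12 * a - 3   # March = 0, ..., February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Re-express a Julian-calendar date in the proleptic Gregorian calendar."""
    jdn = julian_calendar_to_jdn(year, month, day)
    return date.fromordinal(jdn - JDN_TO_ORDINAL_OFFSET)


if __name__ == "__main__":
    # The day after Julian 4 Oct 1582 was Gregorian 15 Oct 1582.
    print(julian_to_gregorian(1582, 10, 5))
    # Russia's "October Revolution" fell in November on the Gregorian calendar.
    print(julian_to_gregorian(1917, 10, 25))
```

The two calls print 1582-10-15 and 1917-11-07. A long dataset that mixes sources from both sides of a switch needs this kind of normalization before any trend over time can be trusted.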

Sunday, February 12, 2012

Another Ode To Big Data


New York Times: The Age of Big Data
an explosion of data — Web traffic and social network comments, as well as software and sensors that monitor shipments, suppliers and customers — to guide decisions, trim costs and lift sales ...... the United States needs 140,000 to 190,000 more workers with “deep analytical” expertise and 1.5 million more data-literate managers, whether retrained or hired ...... The story is similar in fields as varied as science and sports, advertising and public health — a drift toward data-driven discovery and decision-making. “It’s a revolution” ...... the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched ...... Welcome to the Age of Big Data. ...... data a new class of economic asset, like currency or gold ...... Big Data has the potential to be “humanity’s dashboard,” an intelligent tool that can help combat poverty, crime and pollution. Privacy advocates take a dim view, warning that Big Data is Big Brother, in corporate clothing. ........ a lot more data, all the time, growing at 50 percent a year, or more than doubling every two years ....... It’s not just more streams of data, but entirely new ones. ....... there are now countless digital sensors worldwide in industrial equipment, automobiles, electrical meters and shipping crates. They can measure and communicate location, movement, vibration, temperature, humidity, even chemical changes in the air. ........ the Internet of Things or the Industrial Internet. ....... Data is not only becoming more available but also more understandable to computers. Most of the Big Data surge is data in the wild — unruly stuff like words, images and video on the Web and those streams of sensor data. It is called unstructured data and is not typically grist for traditional databases. ........ 
the computer tools for gleaning knowledge and insights from the Internet era’s vast trove of unstructured data are fast gaining ground. At the forefront are the rapidly advancing techniques of artificial intelligence like natural-language processing, pattern recognition and machine learning ....... The wealth of new data, in turn, accelerates advances in computing — a virtuous circle of Big Data. Machine-learning algorithms, for example, learn on data, and the more data, the more the machines learn. Take Siri ....... The microscope, invented four centuries ago, allowed people to see and measure things as never before — at the cellular level. It was a revolution in measurement. ....... Data measurement.... is the modern equivalent of the microscope. Google searches, Facebook posts and Twitter messages, for example, make it possible to measure behavior and sentiment in fine detail and as it happens. ....... decisions will increasingly be based on data and analysis rather than on experience and intuition. “We can start being a lot more scientific” ........ the low-budget Oakland A’s massaged data and arcane baseball statistics to spot undervalued players. Heavy data analysis had become standard not only in baseball but also in other sports, including English soccer, well before last year’s movie version of “Moneyball,” starring Brad Pitt. ...... Walmart and Kohl’s, analyze sales, pricing and economic, demographic and weather data to tailor product selections at particular stores and determine the timing of price markdowns. Shipping companies, like U.P.S., mine data on truck delivery times and traffic patterns to fine-tune routing. ....... Police departments across the country, led by New York’s, use computerized mapping and analysis of variables like historical arrest patterns, paydays, sporting events, rainfall and holidays to try to predict likely crime “hot spots” and deploy officers there in advance. ....... 
data-guided management is spreading across corporate America and starting to pay off. ...... studied 179 large companies and found that those adopting “data-driven decision making” achieved productivity gains that were 5 percent to 6 percent higher than other factors could explain. ...... The predictive power of Big Data is being explored — and shows promise — in fields like public health, economic development and economic forecasting. Researchers have found a spike in Google search requests for terms like “flu symptoms” and “flu treatments” a couple of weeks before there is an increase in flu patients coming to hospital emergency rooms in a region (and emergency room reports usually lag behind visits by two weeks or so). ....... sentiment analysis of messages in social networks and text messages — using natural-language deciphering software — to help predict job losses, spending reductions or disease outbreaks in a given region. The goal is to use digital early-warning signals to guide assistance programs in advance to, for example, prevent a region from slipping back into poverty. ...... trends in increasing or decreasing volumes of housing-related search queries in Google are a more accurate predictor of house sales in the next quarter than the forecasts of real estate economists ....... social-network research involves mining huge digital data sets of collective behavior online. Among the findings: people whom you know but don’t communicate with often — “weak ties,” in sociology — are the best sources of tips about job openings. They travel in slightly different social worlds than close friends, so they see opportunities you and your best friends do not. ...... Researchers can see patterns of influence and peaks in communication on a subject — by following trending hashtags on Twitter, for example. The online fishbowl is a window into the real-time behavior of huge numbers of people. ...... Big Data has its perils, to be sure. 
With huge data sets and fine-grained measurement, statisticians and computer scientists note, there is increased risk of “false discoveries.” ...... “many bits of straw look like needles.” ...... Big Data also supplies more raw material for statistical shenanigans and biased fact-finding excursions. It offers a high-tech twist on an old trick: I know the facts, now let’s find ’em. ..... Data is tamed and understood using computer and mathematical models. These models, like metaphors in literature, are explanatory simplifications. They are useful for understanding, but they have their limits. A model might spot a correlation and draw a statistical inference that is unfair or discriminatory, based on online searches, affecting the products, bank loans and health insurance a person is offered ...... Veteran data analysts tell of friends who were long bored by discussions of their work but now are suddenly curious. .... “The culture has changed” .... “There is this idea that numbers and statistics are interesting and fun. It’s cool now.”
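The "false discoveries" warning can be made concrete with a small simulation. The sketch below is my own illustration, not something from the Times piece: it runs many experiments on pure noise (a fair coin) and counts how many cross a conventional 5 percent significance threshold anyway. The function name, the default sizes, and the 1.96 cutoff are all assumptions chosen for the example.

```python
import random


def count_false_discoveries(num_tests: int = 500, flips: int = 400,
                            z_cutoff: float = 1.96, seed: int = 0) -> int:
    """Count how many pure-noise experiments look 'significant'.

    Every experiment flips a fair coin `flips` times, so there is nothing
    real to find. An experiment is flagged when its share of heads is more
    than `z_cutoff` standard errors away from 50 percent.
    """
    rng = random.Random(seed)
    std_error = (0.25 / flips) ** 0.5  # standard error of the heads share for a fair coin
    flagged = 0
    for _ in range(num_tests):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        z = (heads / flips - 0.5) / std_error
        if abs(z) > z_cutoff:
            flagged += 1
    return flagged


if __name__ == "__main__":
    total = 500
    hits = count_false_discoveries(num_tests=total)
    print(f"{hits} of {total} no-effect experiments looked significant")
```

With a 5 percent threshold, roughly one in twenty of these no-effect experiments should be flagged. Test enough hypotheses against a big enough data set and some straw will always look like needles.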

Big Data Democratization By Wolfram Alpha
Big Data
Facebook And Big Data
Big Data + Smartphone = New Generation Smartphone
Big Data: Big News

Thursday, May 21, 2009

Distributed Search



CNet: Silicon Valley VCs Don't Want Obama's Money, Think Google Is Passe
"The triumph of the distributed Web." ....... the aggregate power of distributed human activity will trump centralized control. ........ Google, and other search engines that analyze the Web and links, are much less useful than a (theoretical) search engine that knows not what people have linked to (as Google does), but rather what pages are open on people's browsers at the moment that people are searching ........ "All the problems of search would be solved if search relevance was ranked by what browsers were displaying" ......... the future is "federated search," in which the Web's users don't just execute search queries, they participate in building the index by the very act of searching, immediately and directly. ............ you probably also want to know what's showing up on users' computers in apps other than the Web browser. ....... the value of real-time searching, as well as social-network-aware searching, will increase dramatically and quickly. .......... the United States' subsidies on ethanol, France's decision to skip the Internet in favor of the state-sponsored Minitel, and Japan's direct investment in supercomputers as it tried to spend its way out of a recession were examples of poor investments. "Government is a particularly poor judge of new technology" ....... The millennials are here. Everything changes. The current generation of graduating college students won't remember a life offline. A deluge of unstructured data creates the next great information leaders. ("The dark matter of the enterprise is unstructured data") ........ Wireless broadband will be one of the only IT sectors to see increased funding this year and in the future. ........ Maintech, not Cleantech ....... Health care administration will be the fastest-growing sector. ...... Consumption of digital goods on mobile devices is the growth story of the coming decade. ....... 
Electronic displays will prove the hottest investment in hardware this year and next. ..... The rumors of the demise of the reporter have been exaggerated.
David Gelernter: Manifesto

Is Google passe? I am not so sure. But Twitter and Wolfram Alpha have put some blood in the water. Suddenly Yahoo and Microsoft are energized on search like never before. The conventional wisdom was that Yahoo had lost the game. Now Yahoo is yelling, not so fast. Google has been challenged, that's for sure.

Goal: A Billion People On Twitter
Search Come Full Circle: That Human Element
Search: Much Is Lacking

This had to happen sooner or later. Search is such a vast terrain. It will stay the number one challenge online. Content creation is not about to slow down. How do you keep up with the exponential explosion in content creation? You have to come up with new ways of doing search. There has always been a ton of room for innovation in search. But now we are seeing new energy in the domain.

Wolfram Alpha Is Cool
Google Falling Behind Twitter?
Eminem: The Relapse: Twitter
What Does Your Resume Look Like Today?
Google Is Working On Search
Job Hunting And 2.0
Is Reading Socializing?
Reimagining The Office
Stream 2.0: The Next Big Thing?
Microfinance, Nanotech, Biotech, Software/Hardware/Connectivity
Define Social Media
The Stream, The Lifestream, The Mindstream
The Human Is The Center Of Gravity In Computing
