March 17, 2016 communication messaging

Fleep: Email compatibility - YouTube

Published on Feb 5, 2016

Fleep is compatible with email - this means you can receive and send emails in Fleep. This video shows how.

Category
- Science & Technology
License
- Standard YouTube License

March 16, 2016 communication messaging

Fleep gets an update - Business Insider

Fleep founders, left to right: Henn Ruukel, Asko Oja, Liis Peetermann, Erik Laansoo, Marko Kreen, Andres Järviste.

Fleep, a messaging platform launched by six Estonians, including four former Skype engineers, is set to launch a new feature as it looks to go viral in the same way as Whatsapp.

Fleep allows people to IM each other, and keep files and pinned notes synced across multiple devices. In order to message someone over Fleep, the user needs to enter the recipient’s email address or Fleep username.

Ultimately, Fleep wants to change the way businesses and individuals communicate. “Leave email behind and manage all conversations with your team, partners and clients in Fleep,” the company writes on its website.

Fleep CEO Henn Ruukel, who used to be director of engineering within the Skype for Business group, told Business Insider today about the changes he is making to help Fleep go viral.

With just 35,000 users across Europe and the US, Fleep has a long way to go if it wants to scale to the same size as platforms like Whatsapp and Skype.

Firstly, the company, which employs 15 people in total, has overhauled its app within the last few weeks with a completely new user interface (Fleep 2.0) that Ruukel hopes will make people want to share it with their friends.

But that’s not all. Ruukel realises that Fleep needs to further expand its product and offer deeper integration with other messaging services if he wants to get more users.

Currently, users can only send and receive email via their Fleep IDs (e.g. sam@fleep.io), but that’s about to change.

“In early 2016, Fleep will launch an email integration feature, which will enable users to send and receive email messages in Fleep while maintaining their existing email addresses,” said Ruukel, adding that he’s optimistic the new feature will help Fleep to ramp up its user numbers.

Fleep also hopes to scale up its business by raising another round of funding over the next 6 months. So far the company has raised €1.9 million (£1.34 million), including an undisclosed amount from Skype cofounder Jaan Tallinn.

“I really don’t like email as a communication method and there’s a good reason for this,” Tallinn told Techworld in April. “It was invented in the 70s and it was intended for much lower volumes.”

He continued: “Fleep is kind of like Skype IM on steroids because it was developed by people who were behind Skype IM in the first place. All the familiar mechanisms have been imported and it works very much like Skype IM, except that it’s much better.

“We absolutely have to get more people using it. I think in terms of features it’s already up there with the big players of the world but in terms of users, no.”

March 16, 2016 communication messaging

Fleep, The Team Messaging App Built And Funded By Ex-Skypers, Flicks Monetization Switch

Fleep, the team messaging app built and backed by a number of ex-Skype engineers, is flicking the monetisation switch today. A year after launching as a free public beta, the Estonian startup is introducing a freemium revenue model that sees users on its paid tier — €3 per month per user — get access to unlimited message history and files, while free users can only access messages from the last 30 days.

That cut off point, says Fleep co-founder and CEO Henn Ruukel, means Premium customers are still able to communicate with non-core team members or external partners on an ad-hoc and free basis, keeping the service as a viable alternative to email.

“If we would have chosen a paid-only model it would limit usage and people would fall back to email, while always-free model would eventually end in indirect monetization through ads or something ugly,” he tells me in a Fleep chat. “I think we were able to draw the line between Free and Premium so it feels fair and is easy to understand.”

In addition to unlimited message and file history, Fleep will soon add “advanced management features” for subscribers to its paid-for Premium service, including team management and administered chats — giving company admins the ability to add and remove users from Fleep chats. That’s no doubt a much-requested enterprise feature, despite Fleep’s positioning as a team messaging app that retains the ‘openness’ of email and its ad-hoc collaborative nature, coupled with the advantage of being a modern messaging platform, including better search, organisational tools and file management. After all, companies still need to maintain a high level of control over company communication, for competitive, political and compliance purposes.

A quick recap of how Fleep works and what makes it different from other team messaging apps, including upstarts such as Cotap, a messaging startup co-founded by two ex-Yammer executives, IMbox.me, backed by ex-Nokia President and CEO Olli Pekka Kallasvuo, TigerText, along with the likes of Microsoft-owned Yammer, Convo, Slack, and HipChat, all of which broadly play in the same space:

To start a new conversation in Fleep, you click on the ‘create new’ button and enter the names of those who you want to see become part of the conversation. If they aren’t already using the app, you can enter their email address instead where they’ll be able to interface with the conversation via email by hitting reply, although in this instance their contribution also gets pulled into Fleep. In other words, not all participants need to be using the app.

To that end, I asked Ruukel why the decision was made to introduce paid tiers now? “Mainly in order to provide clarity to our users,” he says. “Since March when we launched Fleep apps, many have asked how much Fleep will cost, as this is one of the aspects to consider when selecting tools for the team.”

With today’s newly-introduced freemium model, Fleep is finally providing that clarity.

February 26, 2016 database

Inside Libpostal - a fast, multilingual, international street address parser trained on OpenStreetMap data · Mapzen

For the past year, data scientist Al Barrentine has been working with Mapzen to crack one of the hardest problems in geocoding and place search: international address parsing. It’s resulted in Libpostal, a state-of-the-art, lightning-fast C library and statistical model for parsing and normalizing addresses around the world. The address parser alone is 98.9% accurate. And by virtue of being written in C, libpostal can be used directly from several popular languages, with bindings already written for Python, Go, Ruby, Java, and NodeJS.

The world is a big place, but Libpostal is a big step toward making it easier to find any place anywhere (and it only uses open data). We at Mapzen are incredibly excited to soon be using Libpostal as a key part of Mapzen Search and we can’t wait to see what you use it for!

Here, Al explains just how Libpostal came to be and, importantly, shares how it works so others can benefit from what he learned.

Street addresses are among the more quirky artifacts of human language, yet they are crucial to the increasing number of applications involving maps and location. Last year I worked on a collaboration with Mapzen with the goal of building smarter, more international geocoders using the vast amounts of local knowledge in open geographic data sets.

The result is libpostal: a multilingual street address parsing/normalization library, written in C, that can handle addresses all over the world.

Libpostal uses machine learning and is informed by tens of millions of real-world addresses from OpenStreetMap. The entire pipeline for training the models is open source. Since OSM is a dynamic data set with thousands of contributors and the models are retrained periodically, improving them can be as easy as contributing addresses to OSM.

Each country’s addressing system has its own set of conventions and peculiarities and libpostal is designed to deal with practically all of them. It currently supports normalizations in 60 languages and can parse addresses in more than 100 countries. Geocoding using libpostal as a preprocessing step becomes drastically simpler and more consistent internationally.

The core library is written in pure C, which means that in addition to having a small carbon footprint, libpostal can be used from almost any stack or programming language. There are currently bindings written for Python, Go, Ruby, Java, and NodeJS with more popular languages coming soon.

But let’s rewind for a moment.

Why we care about addresses

Addresses are the unique identifiers humans use to describe places, and are at the heart of virtually every facet of modern Internet-connected life: map search, routing/directions, shipping, on-demand transportation, delivery services, travel and accommodations, event ticketing, venue ratings/reviews, etc. There’s a $1B company in almost every one of those categories.

The central information retrieval problem when working with addresses is known as geocoding. We want to transform the natural language addresses that people use to describe places into lat/lon coordinates that all our awesome mapping and routing software uses.

Geocoding’s not your average document search. Addresses are typically very short strings, highly ambiguous, and chock full of abbreviations and local context. There is usually only one correct answer to a query from the user’s perspective (with the exception of broader searches like “restaurants in Fort Greene, Brooklyn”). In some instances we may not even have the luxury of user input at all e.g. batch geocoding a bunch of addresses obtained from a CSV file, the Web or a third-party API.

Despite these idiosyncrasies, we tend to use the same full-text search engines for addresses as we do for querying traditional text documents. Out of the box, said search engines are terrible at indexing addresses. It’s easy to see how a naïve implementation could pull up addresses on St Marks Ave when the query was “St Marks Pl” (both the words “Ave” and “Pl” have a low inverse document frequency and do not affect the rank much). Autocomplete might yield addresses on the 300 block of Main Street for a query of “30 Main Street”. Abbreviations like “Saint” and “St” which are not simple prefix overlaps might not match in most spellcheckers since their edit distance is greater than 2.
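The edit-distance claim is easy to check by hand. The following is a minimal sketch of the textbook Levenshtein distance; the function name `levenshtein` is my own choice for illustration and is not taken from any particular search engine or spellchecker.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


print(levenshtein("saint", "st"))   # 3
print(levenshtein("street", "st"))  # 4
```

A fuzzy matcher that tolerates at most 2 edits would therefore reject both pairs, even though a human reads them as the same word.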

Typically we employ all sorts of heuristics to help with address matching: synonyms lists, special tokenizers, analyzers, regexes, simple parsers, etc. Most of these methods require changing the search engine’s config, and make US/English-centric, overly-simplified assumptions. Even using a full-text search engine in general won’t help in the server-side batch geocoding case unless we’re fully confident that the first result is the correct one.

Geocoding in 2016

Libpostal began with the idea that geocoding is more similar to the problem of record linkage than text search.

The question we want to be able to answer is: “are two addresses referring to the same place?” Having done that, we can simultaneously make automated decisions in the batch setting and return more relevant results in user-facing geocoders.

This decomposes into two sub-problems:

Normalization: the easiest way to handle all the abbreviated variations and ambiguities in addresses is to produce canonical strings suitable for machine comparison, i.e. make “30 W 26th St” equal to “Thirty West Twenty-Sixth Street”, and do it in every language.
Parsing: some components of an address are more essential than others, like house numbers, venue names, street names, and postal codes. Beyond that, addresses are highly structured and there are multiple redundant ways of specifying/qualifying them. “London, England” and “London, United Kingdom” specify the same location if parsed to mean city/admin1 and city/country respectively. If we already know London, there would be no point in returning addresses in Manchester simply because it’s also in the UK.

Once we’ve got canonical address strings segmented into components, geocoding becomes a much simpler string matching problem, the kind that full-text search engines and even relational/non-relational databases are good at handling. With a little finesse one could conceivably geocode with nothing but libpostal and a hash table.

To see how that’s possible, the next two sections describe in detail how libpostal addresses (pun very much intended) the normalization and parsing problems respectively.

Multilingual address normalization

Normalization is the process of converting free-form address strings encountered in the wild into clean normalized forms suitable for machine comparison. This is primarily deterministic/rule-based.

Address normalization using libpostal’s Python bindings

There are several steps involved in making normalization work across so many different languages. I’ll mention the notable ones.

Multilingual tokenization

Tokenization is the process of segmenting text into words and symbols. It is the first step in most NLP applications, and there are many nuances. The tokenizer in libpostal is actually a lexer implementing the Unicode Consortium’s TR-29 spec for unicode word segmentation. This method handles every script/alphabet, including ideograms (used in languages not separated by whitespace e.g. Chinese, Japanese, Korean), which are read one character at a time.

The tokenizer is inspired by the approach in Stanford’s CoreNLP, i.e. write down a bunch of regular expressions and compile them into a fast DFA. We use re2c, a light-weight scanner generator which often produces C that’s as fast as a handwritten equivalent. Indeed, tokenization is quite fast, chunking through more than 2 million tokens per second.

Abbreviation expansion

Almost every language on Earth uses abbreviations in addresses. Historically this had to do with width constraints on things like street signs or postal envelopes. Digital addresses face similar constraints, namely that they are more likely than other types of text to be viewed on a mobile device.

Abbreviations create ambiguity, as there are multiple ways of writing the same address with different degrees of verbosity: “W St Johns St”, “W Saint Johns St”, “W St Johns Street”, and “West Saint Johns Street” are all equivalent. There are similar patterns in most languages.

For expanding abbreviations to their canonical forms, libpostal contains a number of per-language dictionaries, which are simple text files mapping “Rd” to “Road” in 60 languages. Each word/abbreviation can have one or more canonical forms (“St” can expand to “Street” or “Saint” in English), and one or more dictionary types: directionals, street suffixes, honorifics, venue types, etc.

Dictionary types make it possible to control which expansions are used, say if the input address is already separated into discrete fields, or if using libpostal’s address parser to the same effect. With dictionary types, it’s possible to apply only the relevant expansions to each component. For instance, in an English address, “St.” always means “Saint” when used in a city or country name like “St. Louis” or “St. Lucia” and will only be ambiguous when used as part of a street or venue/building name.

The dictionaries are compiled into a trie data structure, at which point a fast search algorithm is used to scan through the string and pull out matching phrases, even if they span multiple words (e.g. “State Route”). This type of search also allows us to treat multi-word phrases as single tokens during address parsing.
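To make the trie idea concrete, here is a toy sketch in Python. It is not libpostal's implementation (which is a compiled C data structure); the nested-dict trie, the function names `build_trie` and `find_phrases`, and the tiny dictionary are all assumptions made for illustration. It shows how a longest-match scan can pull out multi-word phrases such as “State Route” as single units.

```python
def build_trie(phrases):
    """Build a token-level trie from {phrase: canonical forms}.

    Each node is a dict keyed by the next token; the key None marks
    the end of a phrase and stores its canonical forms.
    """
    root = {}
    for phrase, canonical in phrases.items():
        node = root
        for token in phrase.lower().split():
            node = node.setdefault(token, {})
        node[None] = canonical
    return root


def find_phrases(trie, text):
    """Return (start, end, canonical) for the longest phrase at each position."""
    tokens = text.lower().split()
    matches = []
    i = 0
    while i < len(tokens):
        node, best, j = trie, None, i
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if None in node:           # a complete phrase ends here
                best = (i, j, node[None])
        if best:
            matches.append(best)
            i = best[1]                # skip past the matched phrase
        else:
            i += 1
    return matches


# Hypothetical mini-dictionary, for illustration only.
TOY_DICTIONARY = {
    "state": ["state"],
    "state route": ["state route"],
    "rd": ["road"],
    "st": ["street", "saint"],
}

trie = build_trie(TOY_DICTIONARY)
print(find_phrases(trie, "State Route 9 Rd"))
```

Because the scan keeps walking the trie after the first hit, “State Route” wins over the shorter entry “State”.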

Ideographic languages like Japanese and Korean are handled correctly, even though the extracted phrases are not surrounded by whitespace. So are Germanic languages where street suffixes are often appended onto the end of the street name, but may optionally be separated out (Rosenstraße and Rosen Straße are equivalent). All of the abbreviations listed on the OSM Name Finder wiki are implemented as of this writing, plus many more.

At the moment, libpostal does not attempt to resolve ambiguities in addresses, and often produces multiple potential expansions. Some may be nonsensical (“Main St” expands to both “Main Street” and “Main Saint”), but the correct form will be among them. The outputs of libpostal’s expand_address can be treated as a set and address matching can be seen as doing a set intersection, or a JOIN in SQL parlance. In the search setting, one should index all of the strings produced, and use the same code to normalize user queries before sending them to the search server/database.
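The set-intersection view can be sketched without libpostal at all. In the example below, `toy_expand` is a stand-in I made up for illustration (it is not `expand_address` and knows only a handful of English abbreviations); the point is the matching and indexing logic around it.

```python
from itertools import product

# Hypothetical mini-dictionary; real dictionaries are far larger.
TOY_EXPANSIONS = {
    "st": ["street", "saint"],
    "w": ["west"],
    "ave": ["avenue"],
    "pl": ["place"],
}


def toy_expand(address):
    """Return every combination of per-token expansions as a set of strings."""
    tokens = address.lower().replace(".", "").replace(",", "").split()
    options = [TOY_EXPANSIONS.get(tok, [tok]) for tok in tokens]
    return {" ".join(combo) for combo in product(*options)}


def same_place(a, b):
    """Two addresses match if their expansion sets intersect."""
    return bool(toy_expand(a) & toy_expand(b))


# Index every expansion of every record, then normalize queries the same way.
index = {}
for record_id, addr in enumerate(["30 Main Street", "12 W Saint Johns St"]):
    for key in toy_expand(addr):
        index.setdefault(key, set()).add(record_id)


def lookup(query):
    hits = set()
    for key in toy_expand(query):
        hits |= index.get(key, set())
    return hits


print(same_place("Main St", "Main Street"))   # True
print(lookup("12 West St. Johns Street"))     # {1}
```

The bogus expansion “main saint” is harmless here: it never appears in the other set, so it cannot cause a false match on its own.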

Future iterations of expand_address will probably use OpenStreetMap (where abbreviation is discouraged) to build classifiers for ambiguous expansions, and include an option for outputs to be ranked by likelihood. This should help folks who need a “single best” expansion e.g. when displaying the results on a map.

Address language classification

Abbreviations are language-specific. Consider expanding the token “St.” in an address of unknown language. The canonical form would be “Sankt” in German, “Saint” in French, “Santo” in Portuguese, and so on.

We don’t actually want to list all of these permutations. In most user-facing geocoders, we likely know the language ahead of time (say from the user’s HTTP headers or current location). However, in batch geocoding, we don’t know the language of any of our input addresses, so will need a classifier to predict languages automatically using only the input text.

Language detection is a well-studied problem and there are several existing implementations (such as Chromium’s compact language detector) which achieve very good results on longer text documents such as Wikipedia articles or webpages. Unfortunately, because of some of the aforementioned differences between addresses and other forms of text, packages like CLD which are trained on webpages usually expect more/longer words than we have in an abbreviated address, and will often get the language wrong or fail to produce a result at all.

So we’ll need to build our own language classifier and train it specifically for address data. This is a supervised learning problem, which means we’ll need a bunch of address-related input labeled by language, like this:

  de  Graf-Folke-Bernadotte-Straße
  sv  Tollare Träskväg
  nl  Johannes Vermeerstraat Akersloot
  it  Strada Provinciale Ca' La Cisterna
  da  Østervang  Vissenbjerg
  nb  Lyngtangen Egersund
  en  Wood Point Road
  ru  улица Солунина
  ar  جادة صائب سلام
  fr  Rue De Longpré
  he  השלום
  ms  Jalan Sri Perkasa
  cs  Jeřabinová  Rokycany
  ja  山口秋穂線
  ca  Avinguda Catalunya
  es  calle Camilo Flammarión
  eu  Mungialde etorbidea
  pt  Rua Pedro Muler Faria

Sounds great, but where are we going to find such a data set? In libpostal, the answer to that question is almost always: use OpenStreetMap.

OSM has a great system when it comes to languages. By default the name of a place is the official local language name, rather than the Anglicized/Latinized name. Beijing’s default name for instance is “北京市” rather than “Beijing” or “Peking.”

Some addresses in OSM are explicitly labeled by language, especially in countries with multiple official street sign languages like Hong Kong, Belgium, Algeria, Israel, etc. In cases where a single name is used, we build an R-tree polygon index that can answer the question: for a given lat/lon, which official and/or regional language(s) should I expect to see? In Germany we expect addresses to be in German. In some regions of Spain, Catalan or Basque or Galician will be returned as the primary language we expect to see on street signs, whereas (Castilian) Spanish is used as a secondary alternative. In cases where languages are equally likely to appear, the language dictionaries in libpostal are used to help disambiguate. Lastly, street signs are not always written in the languages spoken by the majority of people, a vestige of linguistic imperialism, and the language index accounts for this as well.

All said and done, this process produces around 80 million language-labeled address strings. From there we extract features (informative attributes of the input which help to predict the output) similar to those used in Chromium and the language detection literature: sequences of 4 letters or 1 ideogram, whole tokens for words shorter than 4 characters, and a shortcut for unicode scripts mapping to a single language like Greek or Hebrew. Specific to our use case, we also include entire phrases matching certain language dictionaries from libpostal.
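As a rough illustration of the first two feature types, here is a sketch of a feature extractor. The function names, the feature-name prefixes (`ng:`, `tok:`, `ideo:`) and the ideograph test based on Unicode character names are my own assumptions; the script shortcut and dictionary-phrase features mentioned above are left out to keep it short.

```python
import unicodedata
from collections import Counter


def is_ideograph(ch):
    """Heuristic: treat a character as an ideogram if its Unicode name says so."""
    return "IDEOGRAPH" in unicodedata.name(ch, "")


def address_features(text, n=4):
    """Count character 4-grams, short whole tokens, and single ideograms."""
    feats = Counter()
    for token in text.lower().split():
        pieces, run = [], ""
        for ch in token:
            if is_ideograph(ch):
                if run:
                    pieces.append(run)
                    run = ""
                feats["ideo:" + ch] += 1       # one feature per ideogram
            else:
                run += ch
        if run:
            pieces.append(run)
        for word in pieces:
            if len(word) < n:
                feats["tok:" + word] += 1      # short words are kept whole
            else:
                for i in range(len(word) - n + 1):
                    feats["ng:" + word[i:i + n]] += 1
    return feats


print(address_features("Rua Pedro"))
print(address_features("山口秋穂線"))
```

Each address becomes a sparse bag of such features, which is the kind of input a linear classifier works with.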

We then train a multinomial logistic regression model (also known as softmax regression) using stochastic gradient descent and a few sparsity tricks to keep training times reasonably fast. Logistic regression is heavily used in NLP because unlike Naïve Bayes, it does not make the assumption that input features are independent, which is unrealistic in language.

Another nice property of logistic regression is that its output is a well-calibrated probability distribution over the labels, not just normalized scores that look like probabilities if you “close one eye and squint with the other.” With real probabilities we can implement meaningful decision boundaries. For instance, if the top language returned by the classifier has a probability of 0.99, we can safely ignore the other language dictionaries, whereas if it makes a less confident prediction like 0.62 French and 0.33 Dutch, we might want to throw in both dictionaries. Though the latter type of output should not be interpreted as the distribution of languages in the address itself (as in a multi-label classifier), results with multiple high-probability languages are most often returned in cases like Brussels where addresses actually are written in two languages.
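The decision rule described above can be sketched in a few lines. Softmax turns per-language scores into probabilities by exponentiating and normalizing; the thresholds `confident=0.95` and `min_prob=0.2` below are illustrative values I picked, not settings taken from libpostal.

```python
import math


def softmax(scores):
    """Convert raw per-language scores into probabilities that sum to 1."""
    m = max(scores.values())                      # subtract max for stability
    exps = {lang: math.exp(s - m) for lang, s in scores.items()}
    total = sum(exps.values())
    return {lang: e / total for lang, e in exps.items()}


def choose_dictionaries(probs, confident=0.95, min_prob=0.2):
    """Pick which language dictionaries to apply, given class probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    top_lang, top_p = ranked[0]
    if top_p >= confident:
        return [top_lang]                         # safe to ignore the rest
    return [lang for lang, p in ranked if p >= min_prob]


print(choose_dictionaries({"de": 0.99, "nl": 0.01}))              # ['de']
print(choose_dictionaries({"fr": 0.62, "nl": 0.33, "de": 0.05}))  # ['fr', 'nl']
```

A rule like this only makes sense when the probabilities are reasonably well calibrated, which is the property the paragraph above is pointing at.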

Numeric expression parsing

In many addresses, particularly on the Upper East Side of Manhattan it seems, numbers are written out as words e.g. “Eighty-sixth Street” instead of “86th Street.” Libpostal uses a simplified form of the Rule-based Number Format (RBNF) in CLDR which spells out the grammatical rules for parsing/spelling numbers in various languages.

Rather than try to exhaustively list all numbers and ordinals that might be used in an address, we supply a handful of rules which the system can then use to parse arbitrary numbers.

In English, when we see the word “hundred”, we multiply by any number smaller than 100 to the left and add any number smaller than 100 to the right. There’s a recursive structure there. If we know the rule for the hundreds place, and we know how to parse all numbers smaller than 100, then we can “count” up to 1000.
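The “hundred” rule can be written down directly. The sketch below handles English cardinals only (no ordinals such as “sixth”, no “thousand”), and the function names are hypothetical; it is meant to show the multiply-left, add-right structure rather than the RBNF machinery itself.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def parse_under_100(words):
    """Sum tens and units, e.g. ['forty', 'two'] -> 42 (toy: no validation)."""
    total = 0
    for w in words:
        if w in UNITS:
            total += UNITS[w]
        elif w in TENS:
            total += TENS[w]
        else:
            raise ValueError(f"not a number word: {w!r}")
    return total


def parse_number(text):
    """Apply the 'hundred' rule: multiply by the left side, add the right side."""
    words = [w for w in text.lower().replace("-", " ").split() if w != "and"]
    if "hundred" in words:
        i = words.index("hundred")
        left = parse_under_100(words[:i]) if i > 0 else 1
        right = parse_under_100(words[i + 1:])
        return left * 100 + right
    return parse_under_100(words)


print(parse_number("eighty-six"))               # 86
print(parse_number("three hundred forty-two"))  # 342
```

Adding a “thousand” rule would follow the same pattern, with `parse_number` itself used for the left and right sides.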

Numeric spellings can get reasonably complicated in other languages. French for instance uses some Celtic-style numbers which switch to base 20, so “quatre-vingt-douze” (“four twenties twelve”) = 92. Italian numbers rarely contain spaces so “milleottocentodue” = 1802. In Russian, ordinal numbers can have 3 genders. Libpostal parses them all, currently supporting numeric expressions in over 30 languages.

Roman numerals can be optionally recognized in any language (so “IX” normalizes to 9), though they’re most commonly found in Europe in the titles of popes, monarchs, etc. In most cases Roman numerals are the canonical form, and can be ambiguous with other tokens (a single “I” or “V” could also be a person’s middle initial), so a version of the string with unnormalized Roman numerals is added as well.
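Keeping both forms is simple to sketch. The helper names below are my own, and the parser is deliberately permissive: it does not check that a numeral is well formed, and it only looks at uppercase tokens.

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Subtract a symbol when a larger one follows it, otherwise add it."""
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN[ch]
        if nxt in ROMAN and ROMAN[nxt] > value:
            total -= value
        else:
            total += value
    return total


def with_roman_variants(text):
    """Return the original string plus a version with Roman numerals converted."""
    tokens = text.split()
    converted = [
        str(roman_to_int(t)) if all(c in ROMAN for c in t) else t
        for t in tokens
    ]
    return {text, " ".join(converted)}


print(roman_to_int("IX"))                 # 9
print(with_roman_variants("Via Pio IX"))
```

A token such as a lone “I” would also be converted by this sketch, which is exactly the ambiguity that makes it worth keeping the unnormalized string alongside the normalized one.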

Transliteration

Many addresses around the world are written in non-Latin scripts such as Greek, Hebrew, Cyrillic, Han, etc. In these cases, addresses can be written in the local alphabet or transliterated i.e. converted to a Latin script equivalent. Because the target script is usually Latin, transliteration is also sometimes known as “Romanization.”

For example, “Тверская улица” in Moscow transliterates to “Tverskaya ulitsa.” A restaurant website would probably use the former for its Russian site and the latter for its international site. Street signs in many countries (especially those who’ve at some point hosted a World Cup) will typically list both versions, at least in major cities.

Libpostal takes advantage of all the transliterators available in the Unicode Consortium’s Common Locale Data Repository (CLDR), again compiling them to a trie for fast runtime performance. The implementation is lighter weight than having to pull in ICU, which is a huge dependency and may conflict with system versions.

Each script or script/language combination can use one or more different transliterators. There are for instance several differing standards for transliterating Greek or Hebrew, and libpostal will try them all.

There’s also a simpler transliterator, the Latin to ASCII transform, which converts “œ” to “oe”, etc. This is in addition to standard Unicode normalization, which would decompose “ç” into “c” and “COMBINING CEDILLA (U+0327)”, and optionally strip the diacritical mark to make it just “c.” Accent stripping is sort of an “ignorant American” type of normalization, and can change the pronunciation or meaning of words. Still, sometimes addresses have to be written in an ASCII approximation (because keyboards), especially with travel-related searching, so we do strip accent marks by default, with an optional flag to prevent it.
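Both steps can be reproduced with Python's standard `unicodedata` module. The `LATIN_ASCII` table below is a tiny hand-picked subset I wrote for illustration, not the CLDR Latin-ASCII transform, and the function names are likewise assumptions.

```python
import unicodedata

# A few characters that have no canonical decomposition and need explicit
# replacements; the real transform covers far more than this.
LATIN_ASCII = {"œ": "oe", "Œ": "OE", "æ": "ae", "Æ": "AE",
               "ß": "ss", "ø": "o", "Ø": "O"}


def strip_accents(text):
    """Decompose to NFD, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def latin_to_ascii(text, keep_accents=False):
    """Apply the replacement table, then optionally strip accents."""
    replaced = "".join(LATIN_ASCII.get(ch, ch) for ch in text)
    return replaced if keep_accents else strip_accents(replaced)


print(unicodedata.normalize("NFD", "ç") == "c\u0327")  # True
print(strip_accents("François"))                       # Francois
print(latin_to_ascii("Rosenstraße"))                   # Rosenstrasse
```

The `keep_accents` flag mirrors the idea of an opt-out: stripping is the default, but callers who care about pronunciation or meaning can turn it off.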

Some countries actually translate addresses into English (something like “Tverskaya Street”), creating further ambiguity. At the cost of potentially adding a few bogus normalizations, libpostal can handle such translations by simply adding English dictionaries as a second language option for certain countries/languages/scripts.

International address parsing

Parsing is the process of segmenting an address into components like house number, street name, city, etc. Though many address parsers have been written over the years, most are rule-based and only designed to handle US addresses. In libpostal we develop the first NLP-based address parser that works well internationally.

Parsing addresses with libpostal’s command-line client

The NLP approach to address parsing

International address parsing is something we could never possibly hope to solve deterministically with something like regex. It might work reasonably well for one country, as addresses tend to be highly structured, but there are simply too many variations and ambiguities to make it work across languages. This sort of problem is where machine learning, particularly in the form of structured learning, really shines.

Most NLP courses/tutorials/libraries focus on models and algorithms, but applications on real-world data sets are not in great abundance. Libpostal provides an example of what an end-to-end production-quality NLP application looks like. I’ll detail the relevant steps of the pipeline below, all of which are open source and published to Github as part of the repository.

Creating labeled data from OSM

OpenStreetMap addresses are already separated into components. Here’s an example of OSM tags as JSON:

{
    "addr:housenumber": "30",
    "addr:postcode": "11217",
    "addr:street": "Lafayette Avenue",
    "name": "Brooklyn Academy of Music"
}

This is exactly the kind of output we want our parser to produce. These addresses are hand-labeled by humans and there are lots of them, more than 50 million at last count.

We want to construct a supervised tagger, meaning we have labeled text at training time, but only unlabeled text (geocoder input) at runtime. The input to a sequence model is a list of tagged tokens. Here’s an example for the address above:

Brooklyn/HOUSE Academy/HOUSE of/HOUSE Music/HOUSE
30/HOUSE_NUMBER Lafayette/ROAD Avenue/ROAD Brooklyn/CITY NY/STATE 11217/POSTCODE

At runtime, we’ll only expect to see “Brooklyn Academy of Music, 30 Lafayette Avenue, Brooklyn, NY 11217”, potentially without the commas. With a little creativity, we can reconstruct the free-text input, and tag each token to produce the above training example.

Notice that the original OSM address has no structure/ordering, so we’ll need to encode that somewhere. For this, we can use OpenCage’s address-formatting repo, which defines address templates for almost every country in the world, with coverage increasing steadily over time. In the US, house number comes before street name (“123 Main Street”), whereas in Germany or Spain it’s the inverse (“Calle Ruiz, 3”). The address templates are designed to format OSM tags into human-readable addresses in every country. This is a good approximation of how we expect geocoder input to look in those countries, which means we have our input strings. I’ve personally contributed a few dozen countries to the repo and it’s getting better coverage all the time.
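The reconstruction step can be sketched as follows. The component orderings `ORDER_US` and `ORDER_DE` are simplified stand-ins I wrote for illustration, not the actual templates from the address-formatting repo, and the `extra` argument stands for components such as city and state that are not present in the OSM tags and have to be supplied separately.

```python
# Map OSM tag keys to the parser labels used in the example above.
TAG_TO_LABEL = {
    "name": "HOUSE",
    "addr:housenumber": "HOUSE_NUMBER",
    "addr:street": "ROAD",
    "addr:postcode": "POSTCODE",
}

# Hypothetical per-country component orderings.
ORDER_US = ["HOUSE", "HOUSE_NUMBER", "ROAD", "CITY", "STATE", "POSTCODE"]
ORDER_DE = ["HOUSE", "ROAD", "HOUSE_NUMBER", "POSTCODE", "CITY"]


def tagged_tokens(osm_tags, extra, order):
    """Lay components out in country order and tag every token with its label."""
    components = {TAG_TO_LABEL[k]: v for k, v in osm_tags.items()
                  if k in TAG_TO_LABEL}
    components.update(extra)
    pairs = []
    for label in order:
        for token in components.get(label, "").split():
            pairs.append((token, label))
    return pairs


osm = {
    "addr:housenumber": "30",
    "addr:postcode": "11217",
    "addr:street": "Lafayette Avenue",
    "name": "Brooklyn Academy of Music",
}
example = tagged_tokens(osm, {"CITY": "Brooklyn", "STATE": "NY"}, ORDER_US)
print(" ".join(f"{tok}/{label}" for tok, label in example))
```

Joining the tokens without their labels gives the free-text string the parser would see at runtime, while the labels provide the supervision.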

Also notice that in the OSM address, city, state, and country are missing. We can “fill in the blanks” by checking whether the lat/lon of the address is contained in certain administrative polygons. So that we don’t have to look at every polygon on Earth for every lat/lon, we construct an R-tree to quickly check bounding box containment, and then do the slower, more thorough point-in-polygon test on the bounding box matches. The polygons we use are a mix of OSM relations, Quattroshapes/GeoNames localities, and Zetashapes for neighborhoods.
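The two-stage containment test can be sketched as follows. This is a simplification under stated assumptions: a linear scan over precomputed bounding boxes stands in for the R-tree, polygons are simple rings of (lon, lat) pairs without holes, and all names are hypothetical.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bbox(point, bbox):
    """Cheap test: is the point inside the axis-aligned box?"""
    x, y = point
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def point_in_polygon(point, polygon):
    """Slower test: ray casting, counting edge crossings to the right of the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def containing_polygons(point, named_polygons):
    """named_polygons: list of (name, polygon, bbox) tuples."""
    hits = []
    for name, polygon, bbox in named_polygons:
        # Only run the expensive test on bounding-box matches.
        if in_bbox(point, bbox) and point_in_polygon(point, polygon):
            hits.append(name)
    return hits


triangle = [(0, 0), (4, 0), (0, 4)]
square = [(10, 10), (12, 10), (12, 12), (10, 12)]
polygons = [("triangle", triangle, bounding_box(triangle)),
            ("square", square, bounding_box(square))]
print(containing_polygons((1, 1), polygons))
print(containing_polygons((3, 3), polygons))  # inside the bbox, outside the triangle
```

The point (3, 3) shows why the second stage is needed: it passes the bounding-box check but fails the exact test.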

Making the parser robust

Because geocoders receive a wide variety of queries, we then perturb the address in several ways so the model has to train on many different kinds of input. With certain random probabilities, we use:

Alternate names: for some of the admin polygons (e.g. “NYC”, “New York”, “New York City”) so the model sees as many forms as possible
Alternate language names: OSM does a great job of handling language in addresses. By default a tag like “name” can be assumed to be in the local official language, or hyphenated if there’s more than one language. Something like “name:en” would be the English version. In places with multiple official languages, like Hong Kong, addresses almost always have per-language tags. We use these whenever possible.
Non-standard polygons: like boroughs, counties, districts, neighborhoods, etc. which may be occasionally seen in addresses
ISO codes and state abbreviations: so the parser can recognize things like “Berlin, DE” and “Baltimore, MD”
Component dropout: we usually produce 2–3 different versions of the address with various components removed at random. This way the model also has to learn to parse simple “city, state” queries alongside venue addresses, so it won’t get overconfident e.g. that the first token in an address is always a venue name.
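Component dropout, the last perturbation above, might look roughly like this. The drop probability, the default of three variants, and the function name are placeholder assumptions, not the values libpostal uses.

```python
import random


def dropout_variants(components, n_variants=3, drop_prob=0.3, rng=None):
    """Produce several copies of an address with components removed at random.

    components: ordered list of (label, value) pairs.
    drop_prob:  hypothetical per-component removal probability.
    """
    rng = rng or random.Random()
    variants = []
    for _ in range(n_variants):
        kept = [c for c in components if rng.random() >= drop_prob]
        if not kept:
            # Never emit an empty address; keep one component at random.
            kept = [rng.choice(components)]
        variants.append(kept)
    return variants


address = [
    ("house", "Brooklyn Academy of Music"),
    ("house_number", "30"),
    ("road", "Lafayette Avenue"),
    ("city", "Brooklyn"),
    ("state", "NY"),
    ("postcode", "11217"),
]

for variant in dropout_variants(address, rng=random.Random(42)):
    print(", ".join(value for _, value in variant))
```

Because components are only removed, never reordered, each variant keeps the ordering imposed by the country's address template.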

Structured learning

In structured learning, we typically use a linear model to predict the most likely tag for a particular word given some local and contextual attributes or features. What differentiates structured learning from other types of machine learning is that in structured learning, the model’s prediction for the previous word can be used to predict the current word. In similar tasks like part-of-speech tagging or named entity recognition, we typically design “feature functions” which take the following parameters:

The entire sequence of words
The current index in that sequence
The predicted tags for the previous two words

The function then returns a set of features, usually binary, which might help predict the best tag for the given word.

The tag history is what makes sequence learning different from other types of machine learning. Without the tag history, we could come up with the features for each word (even if they use the surrounding words), and use something like logistic regression. In a sequence model, we can actually create features that use the predicted tag of the previous word.

Consider the use of the word “Brooklyn.” In isolation, we could assume it to mean the city, but it could be many other things e.g. Brooklyn Avenue, The Brooklyn Museum, etc. If we see “Brooklyn” and the last tag was HOUSE_NUMBER, it’s very likely to mean Brooklyn the street name. Similarly, if the last tag was HOUSE (our label for place/building name), it’s likely that we’re inside a venue name e.g. “The Brooklyn Museum.”
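A feature function with the three parameters listed earlier could be sketched like this. The feature names and the "<START>"/"<END>" padding markers are illustrative assumptions; libpostal's real feature set is larger.

```python
def features(words, i, prev_tag, prev2_tag):
    """Return a set of binary features for words[i].

    words:     the entire sequence of tokens
    i:         index of the current token
    prev_tag:  predicted tag of the previous token
    prev2_tag: predicted tag of the token before that
    """
    word = words[i].lower()
    prev_word = words[i - 1].lower() if i > 0 else "<START>"
    next_word = words[i + 1].lower() if i < len(words) - 1 else "<END>"
    return {
        "bias",
        f"word={word}",
        f"prev_word={prev_word}",
        f"next_word={next_word}",
        f"prev_tag={prev_tag}",
        # Tag history combined with the current word: the part that
        # a plain per-token classifier cannot express.
        f"prev_tag+word={prev_tag}|{word}",
        f"prev2_tag+prev_tag={prev2_tag}|{prev_tag}",
    }


street = features(["30", "Brooklyn", "Avenue"], 1, "HOUSE_NUMBER", "<START>")
venue = features(["The", "Brooklyn", "Museum"], 1, "HOUSE", "<START>")
print(sorted(street - venue))
```

The same word "Brooklyn" produces different feature sets in the two contexts, which is what lets the model learn that it is a street after HOUSE_NUMBER and part of a venue name after HOUSE.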

Features

The simplest and most predictive feature is usually the current word itself, but having the entire sequence means there can be bigram/trigram features, etc. This is especially helpful in a case like “Brooklyn Avenue” where knowing that the next word is “Avenue” may disambiguate words used out of their normal context, or help determine that a rare word is a street name. In a French address, knowing that the previous word was “Avenue” is equally helpful, as in “Avenue des Champs-Élysées.”

Training the model for multiple languages entails a few more ambiguities. Take the word “de.” In Spanish it’s a preposition. If we’re lowercasing the training data on the way in, it could also be an abbreviation for Delaware (“wilmington de”) or Deutschland (“berlin de”). Again, knowing the contextual words/tags is quite helpful.

In libpostal, we make heavy use of the multilingual address dictionaries used above in normalization as well as place name dictionaries (aka gazetteers) compiled from GeoNames and OSM. We group known multiword phrases together so e.g. “New York City” will be treated as a single token. For each phrase, we store the set of tags it might refer to (“New York” can be a city or a state), and which one is most likely in the training data. Context features are still necessary though as many streets take their name from a proper place like “Pennsylvania Avenue,” “Calle Uruguay” or “Via Firenze.”
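Grouping known multiword phrases can be sketched with a greedy longest-match pass. This version uses a plain dict as the gazetteer and a hypothetical `max_len` cap; it is a simplified stand-in, not libpostal's own data structure.

```python
def group_phrases(tokens, phrases, max_len=4):
    """Merge the longest known phrase starting at each position into one token.

    phrases: dict mapping a lowercased phrase to the set of tags it may take.
    """
    grouped = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, down to two words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate.lower() in phrases:
                grouped.append(candidate)
                i += n
                break
        else:
            # No multiword phrase starts here; keep the single token.
            grouped.append(tokens[i])
            i += 1
    return grouped


gazetteer = {
    "new york": {"CITY", "STATE"},
    "new york city": {"CITY"},
}

print(group_phrases("30 Lafayette Avenue New York City NY".split(), gazetteer))
```

Trying the longest candidate first is what keeps "New York City" from being split into "New York" plus "City".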

We also employ a common trick to capture patterns in numbers. Rather than consider each number as a separate word or token, we normalize all digits to an uppercase “D” (since we’re lowercasing, this doesn’t conflict with the letter “d”). This allows us to capture useful patterns in numbers and let them share statistical strength. Some examples might be “DDDDD” or “DDDDD-DDDD” which are most likely US postal codes. This way we don’t need many training examples of “90210” specifically, we just know it’s a five digit number. GeoNames contains a world postal code data set, which is also used to identify potential valid postal codes. Some countries like South Africa use 4-digit postal codes, which can be confused for house numbers, and the GeoNames postal codes help disambiguate.
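The digit trick is small enough to show in full. This is a sketch of the idea as described, with a function name of my choosing.

```python
import re


def normalize_digits(token):
    """Lowercase the token, then replace every digit with an uppercase 'D'."""
    # Lowercasing first guarantees 'D' can only come from a digit.
    return re.sub(r"\d", "D", token.lower())


for token in ["90210", "11217-1234", "12B", "Brooklyn"]:
    print(token, "->", normalize_digits(token))
```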

The learning algorithm

We use the averaged perceptron popularized by Michael Collins at Columbia, which achieves close to state-of-the-art accuracy while being much faster to train than fancier models like conditional random fields. On smaller training sets, the additional accuracy might be worth slower training times. On > 50M examples, training speed is non-negotiable.

The basic perceptron algorithm uses a simple error-driven learning procedure, meaning if the current weights are predicting the correct answer, they aren’t modified. If the guess is wrong, then for each feature, one is added to the weight of the correct class and one is subtracted from the weight of the predicted/wrong class. The learning is done online, one example at a time. Since the weight updates are very sparse and occur only when the model makes a mistake, training is very fast.

In the averaged perceptron, the final weights are then averaged across all the iterations. Without averaging it’s possible for the basic perceptron to spend so much of its time altering the weights to accommodate the few examples it gets wrong that it produces an unreasonable set of weights that don’t generalize well to new examples (a.k.a. overfitting). In this way, averaging has a similar effect to regularization in other linear models. As in stochastic gradient descent, the training examples are randomly shuffled before each pass, and we make several passes over the entire training set.
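Here is a compact sketch of the update and averaging steps just described, simplified to independent examples; in the sequence setting the previously predicted tags would simply be part of each feature set. The class and function names are my own, and the lazy bookkeeping is one common way to compute the average, not necessarily how libpostal does it.

```python
import random
from collections import defaultdict


class AveragedPerceptron:
    def __init__(self, classes):
        self.classes = sorted(classes)
        self.weights = defaultdict(lambda: defaultdict(float))  # feature -> class -> weight
        self._totals = defaultdict(float)  # (feature, class) -> accumulated weight
        self._stamps = defaultdict(int)    # (feature, class) -> step of last change
        self.step = 0

    def predict(self, features):
        scores = {c: 0.0 for c in self.classes}
        for f in features:
            for c, w in self.weights.get(f, {}).items():
                scores[c] += w
        # Ties resolve to the first class in sorted order.
        return max(self.classes, key=lambda c: scores[c])

    def _bump(self, feature, cls, delta):
        key = (feature, cls)
        # Credit the old weight for the steps during which it was in effect.
        self._totals[key] += (self.step - self._stamps[key]) * self.weights[feature][cls]
        self._stamps[key] = self.step
        self.weights[feature][cls] += delta

    def update(self, features, truth):
        self.step += 1
        guess = self.predict(features)
        if guess == truth:
            return  # error-driven: correct guesses change nothing
        for f in features:
            self._bump(f, truth, 1.0)
            self._bump(f, guess, -1.0)

    def average(self):
        """Replace each weight with its mean over all training steps."""
        if self.step == 0:
            return
        for feature, by_class in self.weights.items():
            for cls, w in by_class.items():
                key = (feature, cls)
                total = self._totals[key] + (self.step - self._stamps[key] + 1) * w
                by_class[cls] = total / self.step


def train(model, examples, epochs=5, seed=0):
    rng = random.Random(seed)
    examples = list(examples)
    for _ in range(epochs):
        rng.shuffle(examples)  # reshuffle before each pass
        for features, truth in examples:
            model.update(features, truth)
    model.average()
    return model


toy = [({"word=avenue"}, "ROAD"), ({"word=ny"}, "STATE")]
model = train(AveragedPerceptron({"ROAD", "STATE"}), toy)
print(model.predict({"word=ny"}))
```

Accumulating totals lazily means a weight is only touched when it changes, which preserves the sparsity that makes perceptron training fast.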

Though quite simple, this method is surprisingly competitive in part-of-speech tagging, the existing NLP task that’s closest to address parsing, and has by far the best speed/accuracy ratio of the bunch.

Evaluation

In part-of-speech tagging, simple per-token accuracy is the most intuitive metric for evaluating taggers and is used in most of the literature. For address parsing, since we’ll want to use the parse results downstream as fields in normalization and search, a single mistake changes the JSON we’ll be constructing from the parse. Consider the following mistake:

Brooklyn/HOUSE Academy/HOUSE of/ROAD Music/HOUSE
30/HOUSE_NUMBER Lafayette/ROAD Avenue/ROAD Brooklyn/CITY NY/STATE 11217/POSTCODE

In a full-text search engine like Elasticsearch, it might still work to search the name field with [“Brooklyn Academy”, “Music”] plus the other fields and still get a correct result, but if we want to create a structured database from the parses or hash the fields and do a simple lookup, this parse is rendered essentially useless.

The evaluation metric we use is full-parse accuracy, meaning the fraction of addresses where the model labels every single token correctly.
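The difference between the two metrics is easy to see on a tiny example. The sketch below scores the mistaken parse above together with a second, correctly parsed two-token address that I made up for the comparison.

```python
def token_accuracy(gold, predicted):
    """Fraction of individual tokens labeled correctly."""
    total = sum(len(tags) for tags in gold)
    correct = sum(g == p
                  for gold_tags, pred_tags in zip(gold, predicted)
                  for g, p in zip(gold_tags, pred_tags))
    return correct / total


def full_parse_accuracy(gold, predicted):
    """Fraction of addresses where every token is labeled correctly."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


gold = [
    ["HOUSE", "HOUSE", "HOUSE", "HOUSE", "HOUSE_NUMBER",
     "ROAD", "ROAD", "CITY", "STATE", "POSTCODE"],
    ["CITY", "STATE"],  # hypothetical second address, e.g. "Baltimore MD"
]
predicted = [
    ["HOUSE", "HOUSE", "ROAD", "HOUSE", "HOUSE_NUMBER",
     "ROAD", "ROAD", "CITY", "STATE", "POSTCODE"],  # "of" mislabeled as ROAD
    ["CITY", "STATE"],
]

print(token_accuracy(gold, predicted))       # 11 of 12 tokens correct
print(full_parse_accuracy(gold, predicted))  # 1 of 2 addresses correct
```

One wrong token barely moves per-token accuracy but costs the whole address under the full-parse metric, which is why the latter is the stricter number.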

On held-out data (addresses not seen during training), the libpostal address parser currently gets 98.9% of full parses correct. That’s a single model across all the languages, variations, and field combinations we have in the OSM training set.

Future improvements

The astute reader will notice that there’s still an open question here: how well does the synthesized training set approximate real geocoder input? While that’s difficult to measure directly, most of the decisions in constructing the training set thus far have been made by examining patterns in real-world addresses extracted from the Common Crawl, as well as user queries contributed to the project by a production geocoder.

There’s still room for improvement of course. Not every country is represented in the address formatting templates (though coverage continues to improve over time). Most notably, countries using the East Asian addressing system like China, Japan, and South Korea are difficult because the address format depends on which language/script is being used, necessitating some structural changes to the address-formatting repo. In OSM these addresses are not always split into components, possibly residing in the “addr:full” tag. However, since each language uses specific characters to delimit address components, it should be possible to parse the full addresses deterministically and use them as training examples.

The libpostal parser also doesn’t yet support apartment/flat numbers as they’re not included in most OSM addresses (or the address format templates for that matter). The parser typically labels them as part of the house number or street field. For geocoders, apartment numbers aren’t likely to turn up much as people tend to search at the level of the house/building number, but they may be unavoidable in batch geocoding. Supporting them would be relatively straightforward either by adding apartment or floor numbers to some of the training examples at random (without regard to whether those apartments actually exist in a particular building or not), or by parsing the “addr:flats” key in OSM. The context phrases like “Apt.” or “Flat” can be randomly sampled from any language in libpostal with a “unit_types” dictionary.

Conclusions

I’m hoping that libpostal will be the backbone for many great geocoders and apps in years to come. With that in mind, it’s been designed to be:

International/multilingual
Technology and stack independent
Based on open data sets and fully open source

International by design, not as an afterthought

Almost every geocoder bakes in various myopic assumptions e.g. that addresses are only in the US, English, Latin script, the Global North, the bourgeoisie, etc.

Fully embracing L10N/I18N (localization/internationalization) means that there is no excuse for excluding people based on the languages they speak or the countries in which they live. An extra degree of rigor is required in recognizing and eliminating our own cultural biases.

There are of course always constraints on time and attention, so libpostal prioritizes languages in a simple, hopefully democratic way. Languages are added in priority order by the number of world addresses they cover, approximated by OpenStreetMap.

Usable on any platform

Libpostal is written in C mostly for reasons of portability. Almost every conceivable programming language can call into C code. There are already libpostal bindings for Python and NodeJS, and it’s quite easy to write bindings for other languages.

Informed completely by open data

Libpostal makes use of several great open data sets to construct training examples for the address parser and language classifier:

OpenStreetMap is used extensively by libpostal to create millions of training examples of parsed addresses and language classifications.
GeoNames is used by the address parser as a place name and postal code gazetteer, and will also be used for geographic name disambiguation in an upcoming release.
Quattroshapes and Zetashapes polygons are used in various places to add additional administrative and local boundary names to the parser training set. Zetashapes neighborhood polygons were particularly useful since neighborhoods are simple points in OSM.

All of the preprocessing code is open source, so researchers wanting to build their own models on top of open geo data sets are welcome to pursue it from any avenue (the puns just keep getting better) they choose.

The beauty of using these living, open, collaboratively edited data sets is that the models in libpostal can be updated and improved as the data sets improve. It also provides a great incentive for users of the library to support and contribute to open data.

Fin

You made it! The only thing left to do, if you haven’t already, is check out libpostal on Github: https://github.com/openvenues/libpostal.

If you want to contribute and help improve libpostal, you don’t have to know C, or any programming language at all for that matter. For non-technical folks, the easiest way to contribute is to check out our language dictionaries, which are simple text files that contain all the abbreviations and phrases libpostal recognizes. They affect both normalization and the parser. Find any language you speak (or add a directory if it’s not listed) and edit away. Your work will automatically be incorporated into the next build.

Libpostal is already scheduled to be incorporated into at least 3 geocoding applications written in as many languages. If you’re using it or considering it for your project/company, let us know.

Happy geocoding!

February 25, 2016 github hosting

How to Use Github for Hosting Files

Learn how to use Github as a free file hosting service. You can upload images, PDFs, documents or files of any other format to your Github account from the browser.

Github, in simple English, is a website for hosting source code. The site is built for programmers and, if you are not one, it is highly unlikely that you have ever used Github. Repositories and Forks, the basic building blocks of Github, may seem like second nature to developers but, for everyone else, Github continues to be a complicated beast.

Github isn’t just a place for developers though. The site can be used as a writing platform. It can host HTML websites. You can use Github to visually compare the content of two text files. The site’s Gist service can be used for anonymous publishing and as a tasklist. There are so many things to do on Github already, and you can now use it as a free file hosting service as well.

How to Host Files on Github

It takes a few easy steps to turn your Github into a file repository. You can upload files from the browser and you can add collaborators so they can also upload files to a common repository (similar to shared folders in Google Drive). The files are public so anyone can download them with a direct link. The one limitation is that individual files cannot be larger than 25 MB. There are no known bandwidth limits though.

Step 1: Go to github.com and sign up for a free account, if you don’t have one. Choose the free plan as that’s all we need for hosting our files.

Step 2: Click the “New Repository” button, or go to github.com/new, to create a new repository for hosting your files. You can think of a repository as a folder on your computer.

[Github for File Hosting][3]

Step 3: Give your repository a name and a description and click the Create button. It helps to have a description as it will help others discover your files on the web. You can have Private repositories too but that requires a monthly subscription.

Step 4: Your repository will initially be empty. Click the Import Code button on the next screen to initialize the repository.

[Import code into Github][3]

Step 5: Paste the URL _https://github.com/labnol/files.git_ into the repository field and click Begin Import to create your Github repository for hosting files.

Upload Files to Github

Your Github repository is now ready. Click the Upload Files button and begin uploading files. You can drag one or more files from the desktop and then click Commit Changes to publish the files on the web. Github will accept any file as long as the size is within the 25 MB limit.

Github has a built-in previewer for PDF, text and image files (including [animated GIFs][6]) so anyone can view them without downloading the actual file. Otherwise, there’s a simple URL hack to get the raw (downloadable) version of any file hosted on Github.

[Upload Files to Github][3]


Direct URLs for Github Files

After the file has been uploaded to Github, click the filename in the list and you’ll get the file’s URL in the browser’s address. Append ?raw=true to the URL and you get a downloadable / embeddable version.

For instance, if the file URL is github.com/labnol/files/hello.pdf, the direct link to the same file would be github.com/labnol/files/hello.pdf?raw=true. If the uploaded file is an image, you can even embed it in your website using the standard img tag.
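If you have many links to convert, the same trick can be scripted. This is a small sketch using Python's standard library; the function name is my own, and the sample URL is the T-Rex image from the sample repository.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def raw_url(file_url):
    """Return the file URL with raw=true set in its query string."""
    parts = urlsplit(file_url)
    query = dict(parse_qsl(parts.query))
    query["raw"] = "true"  # add the parameter, or overwrite an existing one
    return urlunsplit(parts._replace(query=urlencode(query)))


print(raw_url("https://github.com/labnol/files/blob/master/trex.jpg"))
```

Going through the query string rather than pasting text on the end means the function is safe to run on a link that already has ?raw=true.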

Here’s a sample [file repository][8] on Github. The T-Rex image is [here][9] and the direct link is [here][10]. You can go to the Repository settings and add one or more collaborators. They’ll get write access to your repository and can then add or delete files.

[3]:
[4]:
[5]:
[6]: http://www.labnol.org/tag/gif/
[7]:
[8]: https://github.com/labnol/files
[9]:
[10]: https://github.com/labnol/files/blob/master/trex.jpg?raw=true

February 10, 2016 dropbox

15 Things You Didn’t Know You Could Do with Dropbox

Just when you thought Dropbox couldn’t get any better, it has.

Many interesting cloud storage services have come and gone, but Dropbox is probably the one that’s been here the longest. And now it has upped its game with a host of new features. Let’s explore some of them from 2015 as well as some old but lesser-known ones. What we’re saying is let’s discover more stuff that you didn’t know you could do in and with Dropbox.

1. Request Files from Anyone

Sharing files saved in your Dropbox has always been easy. Collecting files in Dropbox from people? Not so much. You had to rely on third-party services for quite a long time…until Dropbox introduced its own file request feature. The best thing about it is that you can gather files even from people who don’t have a Dropbox account. No reason to force them to sign up for one, is there?

To initiate a file request, first head straight to your Dropbox account and click on File Requests in the sidebar to go to the file requests page. See that big blue plus icon there? Click on it to create a file request.


You’ll have to specify a catchall name for the files that you want to collect. Dropbox creates a new folder with this name to direct the incoming files to. You can also use an existing folder instead.


For every file request that you create, you’ll get a unique link to share with the people you want to receive files from. Ensure that you have enough space in your Dropbox account for the incoming files. Otherwise, the person sending the files will encounter an error message.

Don’t worry about the privacy settings for the received files. Only you can see them, and later share them if and when you want to.

I used the @Dropbox File Request feature this morning, and it worked perfectly. Consider me impressed!

— Devon Michael Dundee (@devondundee) January 14, 2016

If you’re on the receiving end of a file request, you’ll get an email with a link to upload the requested files. Click on it and Dropbox will walk you through the straightforward upload process. You’ll have to limit the file size to 2 GB if you’re sending it to a Dropbox Basic user and to 10 GB if you’re sending it to a Pro or Business user.

We also recommend giving Balloon a try, if you don’t mind ditching the built-in file request feature in favor of a third-party app.

2. Preview Photoshop and Illustrator Files

Has someone shared a PSD file or an AI file with you on Dropbox? You don’t need access to the right Adobe software to preview it. You can do that right from Dropbox’s web interface, thanks to the interactive file preview feature introduced mid-2015.

Click on the file you want to preview and you’ll get an image toolbar that you can use to zoom in on any portion of the preview.

Coolest surprise of the day? Being able to preview an @Illustrator file in @Dropbox on #iOS. Geeeeenius!!

— Sophie Exintaris (@eurydice13) December 3, 2015

You can preview files not only in PSD and AI formats, but also in PNG, JPG, EPS, SVG, and BMP. However, the previews for certain formats like PSD, AI, and SVG will be sharper and clearer than for the rest. The file preview feature also allows you to preview PDFs, slideshows, videos, and more.


If you’re a creative professional, the preview feature ensures that you don’t have to worry about compressing high-resolution files or converting them to other, more easily viewable formats for sharing with clients. Share a Dropbox link to the design file and be done with it. Your client can preview the file (in full resolution!) and leave feedback on it from Dropbox on the web.

3. Rejoin Shared Folders

Let’s say you left a shared folder, accidentally or otherwise, by deleting it from your Dropbox, and now you want back in. Regaining access to that folder is as simple as clicking on Sharing in the sidebar and then clicking on the Rejoin link next to the folder you want fresh access to.


Remember, deleting files inside the shared folder works differently from deleting the shared folder itself. The former will make the files disappear from everybody else’s Dropbox account as well, but then again, anyone with access to the shared folder can restore them.

4. Find Files Faster with Dropbox Recents

You don’t have to dig through folder after folder to find a Dropbox file that you just edited. You’ll find a link to it under Recents in the sidebar. This section keeps an updated list of files that you have opened or modified recently. Share, download, comment, delete, or even view previous versions of the file straight from this list.


5. Work as a Team

Many Dropbox users — solopreneurs, for example — use the Basic and Pro versions of Dropbox for business. If you’re one of those users, congratulations. You can now collaborate better on projects using the new Team feature.

After you create a team, you’ll be able to add members to it, share files and folders with them, and create sub-folders for better organization. As the team administrator you get granular control over file and folder permissions. Also, you’re sure to appreciate the ability to link your work and personal Dropbox accounts and switch between them easily without having to log out of either.

Having 2 different Dropbox accounts in one for Personal/Work is awesome. Awesome new Team feature @Dropbox!

— Maarten Busstra (@qtbox) October 28, 2015

Your work projects are not the only ones that can benefit from this collaboration feature. Personal projects also can. Have a family vacation coming up? Or a wedding? Or a friend’s birthday? Create a Dropbox team and get started on the planning!

6. Discuss Files You’re Viewing

You have probably noticed that Dropbox files on the web now come with a commenting mechanism. If you haven’t, shift your attention to the right sidebar when you have a file or file preview open, and there it is.

As is standard procedure on the web these days, you can @mention someone to get their attention, and in this case, to get their input on the file. They’ll receive an email notification about it and can leave a comment on the file even if they aren’t a Dropbox user.


The added advantage is that if it’s a Microsoft Office file that you’re discussing, you can edit it right there based on the feedback, thanks to the Dropbox-Office Online integration. Your edits will automatically get saved back to Dropbox.

7. Sync Files Faster

By default, Dropbox limits the bandwidth allocated to the files being uploaded to your account. If you want to take advantage of your network’s higher capacity, you can remove this limit altogether or set a custom one from Dropbox’s settings.

To remove bandwidth limits for file uploads on a Mac, first open Preferences from Dropbox’s menu bar icon.

Next, switch to the Network tab and click on the Change Settings button next to Bandwidth. Now select the radio button next to Don’t limit, or if you want to specify a limit, select the radio button next to Limit to and type in an upload speed. You can also limit the download rate from the same section. Hit the Update button once you have made the changes.


To access the bandwidth settings on Windows 7 and above, click on the Dropbox icon in the system tray and go to Preferences > Bandwidth.

8. Instantly Delete Sensitive Files for Good…

Files that you delete from your Dropbox don’t disappear immediately from your computer or your Dropbox account. They get queued up for permanent deletion and stay part of the Dropbox ecosystem for at least 30 days. The deleted files also stay in the cache folder (.dropbox.cache) within Dropbox’s root folder on your computer for three days.

Note: If you have a Pro account with Extended Version History, the deleted files stay in the online deletion queue for up to one year.

My #Dropbox is acting weird. Even though I delete a folder, it keeps appearing again. :/

— Arun Sathiya (@iarunsb) January 1, 2016

If the files you deleted contain sensitive data, you might want to clear them out from the deletion queue manually. To do so, go to the home page of your account and click on the trash icon to the left of the search box. This displays the deleted files and they appear grayed out.

Now select a binned file that you want to erase permanently and click on the Permanently delete… option in the menu bar at the top. Do this for each file that you want to erase right away. Of course, you can select multiple files using Ctrl on Windows or cmd on a Mac.


Here comes another important step: getting rid of the deleted files from Dropbox’s cache folder. You can’t see this folder unless your system is set to show hidden files. You’ll need to access it and once again delete the files from there to get rid of them for good. Of course, if you do nothing, Dropbox will still clear the cache folder in three days’ time.

Based on whether you’re using Windows, Mac, or Linux, you’ll have to look up Dropbox’s instructions to reveal the cache folder on your computer.

Warning: You can’t recover any of the files you have deleted using the steps above, but someone with access to your computer and good recovery software might be able to.

Be 100% sure that you want to delete a file before you delete it.
Look for a more advanced security solution to remove even the most deeply hidden remnants of deleted files.

9. Add a 4-Digit PIN to the Dropbox App on Your Mobile

You know all about protecting your Dropbox account with two-factor authentication and you have set it up already, right? Have you also secured the Dropbox app on your phone or tablet with a PIN or passcode? The passcode feature is not new, but it’s one that many people overlook.

Set a passcode for the Dropbox app now via Dropbox settings > Advanced Features > Configure Passcode on your Android device or via Dropbox settings > Passcode Lock on your iPhone. For iPads and Windows tablets, here are the instructions to set a passcode.


Are you a Pro user? Then in addition to setting a passcode, you can enable the setting to remotely erase all Dropbox data on that device after 10 failed attempts at entering the correct PIN. This can prove helpful if your phone ever falls into the wrong hands. There’s a catch though. You can proceed with the remote data wipe only if the device is online.

Also, if you’re a Basic user, you have to content yourself with unlinking the lost device by clicking on the “x” icon next to its name under Dropbox Settings > Security > Devices.

10. Carry Your Bookmarks Everywhere

Dropbox being such a great way to sync anything, we have all come up with various makeshift ways to sync bookmarks to the cloud. But we don’t need them anymore, because Dropbox has now added a feature to do just that.

You can now drag and drop links to Dropbox on the web or on your computer. They get backed up just like your files do, so you can open them from any location.


Unfortunately, clicking on a bookmark from Dropbox’s web interface loads a preview page for the bookmark instead of the link suggested by the bookmark. That’s why we recommend using the bookmark’s context menu to open the link in a new tab.

You’ll really appreciate the convenience of this bookmarking feature when you’re collaborating with someone on a project and have a bunch of shared links to keep track of.

11. Host a Podcast from Dropbox with JustCast

We recently shared an exhaustive guide on how to start a successful podcast. If you’re gearing up to start a podcast yourself and are on the lookout for a decent, easy-to-manage podcast host, your search ends here — with JustCast, which is ridiculously simple to use.

Once you connect JustCast to your Dropbox, a folder named JustCast will appear in /Dropbox/Apps. Any mp3 file you add to Dropbox/Apps/JustCast/podcast_name will automatically go in your podcast’s RSS feed. All you have to do is tell people to subscribe to the feed. Use the in-built metrics feature to track the subscriber and download count.


To publish the podcast on iTunes, visit this link for podcast submission and paste the link to your RSS feed there to proceed.

Now let’s talk money. You don’t have to shell out any if you’re content having just three of the most recent episodes showing up in the feed. For unlimited feed items, you have the Pro plan at $5/month.

Here’s something you should make a note of. Dropbox has some restrictions in place on file hosting and sharing. So once your podcast gathers momentum and your audience grows, you’ll need to consider upgrading your Dropbox account to keep up with the increasing number of file downloads.

@badbeef I use JustCast. It takes a dropbox folder and turns it into a Podcast source with little setup. https://t.co/ych9zAbbxn #heynow

— Bt (@mingistech) November 13, 2015

Even if starting a podcast is not in your plans, you can put JustCast to good use by turning it into a personal podcast playlist. Put any MP3 audio files you want to listen to into Dropbox as described above and use the RSS feed in your podcast client — just as you would with any other podcast.

Be mindful of copyright restrictions for any files you’re uploading to Dropbox.

12. Theme Your Dropbox with Orangedox

If you use Dropbox for work, you might want to tweak its interface to align with your brand. And that’s where Orangedox steps in. It gives you tools to add special touches to the Dropbox portal, such as your own logo and color scheme.

Orangedox also allows you to track the documents you have shared and get download stats for them. Note that only this feature is available in the Free Forever plan.

I’m in love with Orangedox! Let’s me track downloads from Dropbox folders…free! http://t.co/1yHN5vMxEC

— Shana Festa (@BookieMonsterSF) October 1, 2014

We must admit that Orangedox has not quite picked up steam despite being launched more than a year ago i.e. in 2014. But considering that there seem to be zero apps that allow you to theme Dropbox, Orangedox is still worth a shot.

13. Create Photo Galleries Using Dropbox Photos with Photoshoot

Okay. We admit that we’re cheating a bit here. You already know of apps that turn your Dropbox photos into galleries. But we had to include Photoshoot in this list because it makes the process so easy.

You drag and drop photos into Dropbox and Photoshoot takes care of creating the gallery, complete with items like thumbnails, titles, dates, and a lightbox display. You can leave the gallery visible to the public or hide it behind a password.

Professional photographers will get the most out of Photoshoot. If you are one, you’ll be happy to know that the app gives you options to use a custom domain, add your logo, theme the gallery with your brand’s colors, etc. You can even add links to your social networks.

The verdict is that if you’re looking for a hassle-free and elegant way to show off your best work, you’ll fall in love with Photoshoot. Check out a sample gallery here.

14. Skip File Display and Go Straight to File Download

When you click on a Dropbox link you have received, your browser displays the file and gives you an option to download it. But you can force your browser to start downloading the file immediately instead of displaying it first. To do so, you’ll have to change the dl=0 query parameter in the shared link to dl=1.

Let’s say the Dropbox link reads www.dropbox.com/…/URL.webloc?dl=0. Paste it into your browser’s address bar, change the dl=0 bit at the end of the link to dl=1 (www.dropbox.com/…/URL.webloc?dl=1), and then hit Enter. Your browser will begin downloading the file right away.
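If you do this often, the same edit is easy to script. Here is a minimal Python sketch that sets the dl query parameter to 1 while leaving the rest of the link alone. The function name force_download and the sample link are made up for illustration; they are not part of any Dropbox tool or API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_download(shared_link: str) -> str:
    """Return the shared link with its dl query parameter set to 1."""
    parts = urlsplit(shared_link)
    # Keep every other query parameter and drop any existing dl value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    params.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Hypothetical shared link, used only as an example.
    print(force_download("https://www.dropbox.com/s/abc123/notes.pdf?dl=0"))
```

The sketch parses the query string instead of doing a plain text replace, so it also handles links that have no dl parameter yet or that carry other parameters alongside it.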

TIL can load files from Dropbox in Safari/iOS pic.twitter.com/ZXJCGiWSEU

— Ricardo Cabello (@mrdoob) October 29, 2015

Want quick access to your Dropbox folders without having to switch to a new Finder window on OS X? The lightweight App Box for Dropbox can help you with that. For $0.99 it places your Dropbox inside a panel that you can display with a single click from the menu bar. Sounds basic? It is. Sounds useful? It’s that too. We wish Windows also had something similar to put the whole of Dropbox in a pop-up panel accessible from the system tray.

Note that there are other similarly named versions of this app in the Mac App Store, and they offer similar functionality. It’s not clear whether they come from the same developer, though. One of the versions is even free. Do your research before you install the app.

What’s in Store for Dropbox in 2016?

From Dropbox tools for the power user to Dropbox etiquette to time-saving Dropbox shortcuts, we poured everything we knew about Dropbox into article after article. And we thought we had covered it all. We were wrong. As you can see, Dropbox is keeping us on our toes and giving us fodder for more articles. We hope it keeps up this pace in future. Happy “Dropboxing”!

Have you been using some of the new features introduced by Dropbox in 2015? Which Dropbox tricks or apps have you come across lately? Give us your best Dropbox tips in the comments.
