Most media attention on applications of AI (artificial intelligence and machine learning) has gone to areas such as smart traffic, autonomous cars, recommendation algorithms, and expert systems in all kinds of professional work. There are, however, also very interesting developments currently taking place around photography.
There are multiple areas where AI is augmenting or transforming photography. One is in how the software tools that professional and amateur photographers use are advancing. It is becoming steadily easier to select complex areas in photos, for example, and to apply all kinds of useful, interesting or creative effects and functions to them (see e.g. what Adobe is writing about this in: https://blogs.adobe.com/conversations/2017/10/primer-on-artificial-intelligence.html). The technical quality of photos is improving, as AI and advanced algorithmic techniques are applied in e.g. enhancing the level of detail in digital photos. Even a blurry, low-resolution file can be augmented with AI to look like a very realistic, high resolution photo of the subject (on this, see: https://petapixel.com/2017/11/01/photo-enhancement-starting-get-crazy/).
But the applications of AI do not stop there. Google and other developers are experimenting with “AI-augmented cameras” that can recognize persons and events taking place, and take action, capturing photos and videos of moments and subjects that the AI, rather than the human photographer, deems worthy (see, e.g. Google Clips: https://www.theverge.com/2017/10/4/16405200/google-clips-camera-ai-photos-video-hands-on-wi-fi-direct). This development can go in multiple directions. There are already smart surveillance cameras, for example, that learn to recognize the family members and differentiate them from unknown persons entering the house. Such a camera, combined with a conversant backend service, can also serve the human users in their various information needs: telling whether kids have come home in time, or keeping track of any out-of-the-ordinary events that the camera and algorithms might have noticed. Lighthouse AI is one example: it combines a smart security camera with such an “interactive assistant”.
In the domain of amateur (and also professional) photographer practices, AI also means many fundamental changes. There are already add-on tools like Arsenal, the “smart camera assistant”, which is based on the idea that manually tweaking all the complex settings of modern DSLR cameras is not that inspiring, or even necessary, for many users, and that a cloud-based intelligence could handle many challenging photography situations with better success than a fumbling regular user (see their Kickstarter video at: https://www.youtube.com/watch?v=mmfGeaBX-0Q). Such algorithms are already also being built into the cameras of flagship smartphones (see, e.g. AI-enhanced camera functionalities in Huawei Mate 10, and in Google’s Pixel 2, which use AI to produce sharper photos with better image stabilization and better optimized dynamic range). Such smartphones, like Apple’s iPhone X, typically come with a dedicated chip for AI/machine learning operations, like the “Neural Engine” of Apple. (See e.g. https://www.wired.com/story/apples-neural-engine-infuses-the-iphone-with-ai-smarts/).
Many of these developments point the way towards a future age of “computational photography”, where algorithms play as crucial a role in the creation of visual representations as optics do today (see: https://en.wikipedia.org/wiki/Computational_photography). It is interesting, for example, to think about situations where photographic presentations are constructed from data derived from a myriad of different kinds of optical sensors, scattered in wearable technologies and in the environment, which will try their best to match the mood, tone or message set by the human “creative director”, who is no longer employed as the actual camera-man/woman. It is also becoming increasingly complex to define the authorship and ownership of photos, and, most importantly, to handle the privacy and data processing issues related to visual and photographic data. – We are living in interesting times…
In the 1970s and 1980s the concept ‘cognitive engineering’ was used in the industry labs to describe an approach trying to apply cognitive science lessons to the design and engineering fields. There were people like Donald A. Norman, who wanted to devise systems that are not only easy, or powerful, but most importantly pleasant and even fun to use.
One of the classical challenges of making technology suit humans is that humans change and evolve, and differ greatly in motivations and abilities, while technological systems tend to stay put. Machines are created in a certain manner, and are mostly locked within the strict walls of the material and functional specifications they are based on, and (if correctly manufactured) operate reliably within those parameters. Humans, however, are fallible and changeable, but also capable of learning.
In his 1986 article, Norman uses the example of a novice and an experienced sailor, who differ greatly in their abilities to take the information from the compass and translate it into a desirable boat movement (through the use of tiller and rudder). There have been significant advances in multiple industries in making increasingly clear and simple systems that are easy to use by almost anyone, and this in turn has translated into increasingly ubiquitous or pervasive application of information and communication technologies in all areas of life. The televisions in our living rooms are computing systems (often equipped with apps of various kinds), our cars are filled with online-connected computers and assistive technologies, and in our pockets we carry powerful terminals into information, entertainment, and into the ebbs and flows of social networks.
There is, however, also an alternative interpretation of what ‘cognitive engineering’ could be, in this dawning era of pervasive computing and mixed reality. Rather than being limited to engineering products that attempt to adapt to the innate operations, tendencies and limitations of human cognition and psychology, engineering systems that are actively used by large numbers of people also means designing and affecting the spaces within which our cognitive and learning processes will then evolve, fit in, and adapt. Cognitive engineering does not only mean designing and manufacturing certain kinds of machines; it also translates into an impact on the human element of this dialogical relationship.
Graeme Kirkpatrick (2013) has written about the ‘streamlined self’ of the gamer. There are social theorists who argue that living in a society based on computers and information networks produces new difficulties for people. Social, cultural, technological and economic transitions linked with life in late modern, capitalist societies involve movements from projects to new projects, and an associated necessity for constant re-training. There is not necessarily any “connecting theme” in life, or even a sense of personal progression. Following Boltanski and Chiapello (2005), Kirkpatrick analyses the subjective condition where life in contradiction – between exigency of adaptation and demand for authenticity – means that the rational course in this kind of systemic reality is to “focus on playing the game well today”. As Kirkpatrick writes, “Playing well means maintaining popularity levels on Facebook, or establishing new connections on LinkedIn, while being no less intensely focused on the details of the project I am currently engaged in. It is permissible to enjoy the work but necessary to appear to be enjoying it and to share this feeling with other involved parties. That is the key to success in the game.” (Kirkpatrick 2013, 25.)
One of the key theoretical trajectories of cognitive science has been focused on what has been called “distributed cognition”: our thinking is not only situated within our individual brains, but it is in complex and important ways also embodied and situated within our environments, and our artefacts, in social, cultural and technological means. Gaming is one example of an activity where people can be witnessed to construct a sense of self and its functional parameters out of resources that they are familiar with, and which they can freely exploit and explore in their everyday lives. Such technologically framed play is also increasingly common in working life, and our schools can similarly be approached as complex, designed and evolving systems that are constituted by institutions, (implicit, as well as explicit) social rules and several layers of historically sedimented technologies.
Beyond all the hype of new commercial technologies related to virtual reality, augmented reality and mixed reality of various kinds lies the fact that we have always already lived in a complex substrate of mixed realities: a mixture of ideas, values, myths and concepts of various kinds, that are intermixed and communicated within different physical and immaterial expressive forms and media. Cognitive engineering of mixed reality in this more comprehensive sense involves taking part in dialogical cycles of design, analysis and interpretation, where practices of adaptation and adoption of technology also shape the forms in which these technologies are realized. Within the context of game studies, Kirkpatrick (2013, 27) formulates this as follows: “What we see here, then, is an interplay between the social imaginary of the networked society, with its distinctive limitations, and the development of gaming as a practice partly in response to those limitations. […] Ironically, gaming practices are a key driver for the development of the very situation that produces the need for recuperation.” There are multiple other areas of technology-intertwined lives where similar double bind relationships are currently surfacing: in social use of mobile media, in organisational ICT, in so-called smart homes, and in smart traffic design and user culture processes. – A summary? We live in interesting times.
– Boltanski, Luc, and Eve Chiapello (2005) The New Spirit of Capitalism. London & New York: Verso.
– Kirkpatrick, Graeme (2013) Computer Games and the Social Imaginary. Cambridge: Polity.
– Norman, Donald A. (1986) Cognitive engineering. In Donald A. Norman & Stephen W. Draper (eds.) User Centered System Design. Hillsdale, NJ: Lawrence Erlbaum, 31–61.
Portable Document Format (PDF) files are a pretty standard element in academic and business life these days. The format is a sort of compromise, a tool for living a life that is partly based on traditional paper documents and their conventions, and partly on new, digital functionalities. A PDF file should keep the appearance of the document the same as it moves from device to device and user to user, and it can also support various more advanced functionalities.
One such key function is the ability to sign a document (an agreement, a certificate, or such) with a digital signature. This can greatly speed up many critical processes in the contemporary, global, mobile and distributed lives of individuals and organisations. Rather than waiting for a key person to arrive back at their office from a trip, to physically sign a document with pen and paper, a PDF version of the document (for example) can simply be emailed to the person, who then adds their digital signature to the file, saves it, and sends the signed version back.
In legal and technical terms, there is nothing stopping us from moving completely to using digital signatures. There are explanations of the legal situation e.g. here:
And Adobe, the leading company in the electronic documents business, provides step-by-step instructions on how to add or generate the cryptographic mechanisms to ensure the authenticity of digital signatures in PDFs with their Acrobat toolset:
In my experience, most contracts and certificates are still required to be signed with a physical pen, ink, and paper, even though the digital tools exist. The reasons are not legal or technical, but rather rooted in organisational routines and processes. Many traditional organisations are still not “digital” or “paperless”, but rather build upon decades (or: centuries!) of paper trail. If the entire workflow is built upon the authority of authentic, physically signed contracts and other legal (paper) documents, it is hard to transform the system. At the same time, the current situation is far from optimal: in many cases there is double work, as everything needs to exist both as physical papers (for signing, and for paper-based archiving), and as PDF scans (for distribution, in intranets, in email, in other electronic archives that people use in practice).
There are useful new tools like Kami (https://www.kamihq.com/) that facilitate the move to a “paperless classroom”, with their easy-to-use functions for drawing, editing, and commenting on PDFs (Adobe’s business-oriented solutions are not the best answer for all users and situations).
(This is the first post in a planned series, focusing on various aspects of contemporary information and communication technologies.)
Contemporary computing is all about the flow of information: be it a personal computer, a mainframe server, a mobile device or even an embedded system in a vehicle, the computers of today are not isolated. For better or worse, all things are increasingly integrated into world-wide networks of information and computation. This also means that the ports and interfaces for all that data transfer take even higher prominence and priority than in the old days of more locally situated processing.
Thinking about the transfer of data, some older generation computer users might still remember things like floppy disks or other magnetic media, which were used both for saving work files and often for distributing and sharing that work with others. Later, optical disks, external hard drives, and USB flash drives superseded floppies, but a more fundamental shift was brought along by the Internet and “cloud-based” storage options. In some sense the development has meant that personal computing has returned to the historical roots of distributed computing in ARPANET and its motivation in the sharing of computing resources. But regardless of what kind of larger network infrastructure mediates the operations of the user and the service provider, all that data still needs to flow around, somehow.
The key technologies for information and communication flows today appear to be largely wireless. The mobile phone and tablet communicate with the networks using wireless technologies, either WiFi (wireless local area networking) or cellular networks (GSM, 3G and their successors). However, all those wireless connections end up linking into wired backbone networks that operate at much higher speeds and reliability standards than the often flaky, local wireless connections. As algorithms for coding, decoding and compression of data have evolved, it is possible to use wireless connections today to stream 4K Ultra HD video, or to play high speed multiplayer games online. However, in most cases, wired connections will provide lower latency (meaning more immediate response), fewer errors and higher speeds. And while there are efforts to bring wireless charging to mobile phones, for example, most of the information technology we use today still needs to be plugged into some kind of wire, at least for charging its batteries.
This is where new standards like USB-C and Thunderbolt come into the picture. Thunderbolt (currently Thunderbolt 3 is the most recent version) is a “hardware interface”, meaning it is a physical, electronics based system that allows two computing systems to exchange information. This is a different thing, though, from the actual physical connector: “USB Type C” is the full name of the most recent incarnation of “Universal Serial Bus”, an industry standard of protocols, cables, and connectors that was originally released already in 1996. The introduction of the original USB was a major step towards the interoperability of electronics, as the earlier situation had been developing into a jungle of proprietary, non-compatible connectors – and USB is a major success story, with several billion connectors (and cables) shipped every year. Somewhat confusingly, the physical, bi-directional connectors of USB-C can hide behind them many different kinds of electronics, so that some USB-C connectors comply with USB 3.1 mode (with data transfer speeds up to 10 Gbit/s in the “USB 3.1 Gen 2” version) and some are implemented with Thunderbolt – and some support both.
USB-C and Thunderbolt have in a certain sense achieved a considerable engineering marvel: with backward compatibility to older USB 2.0 mode devices, this one port and cable should be able to connect to multiple displays with 4K resolutions and to external data storage devices (with up to 40 Gbit/s speeds), while also working as a power cable: with USB Power Delivery, which Thunderbolt 3 ports also support, a single USB-C type port can serve, or drain, up to 100 watts of electric power – making it possible to remove separate power connectors, and share power bricks between phones, tablets, laptop computers and other devices. The small form factor Apple MacBook (“Retina”, 2015) is an example of this line of thinking. One downside for the user of this beautiful simplicity of a single port in the laptop is the need to carry various adapters to connect with anything outside of the brave new USB-C world. In an ideal situation, however, life would be much simpler if there were only this one connector type to worry about, and it were possible to use a single cable to dock any device to the network, gain access to large displays, storage drives, high speed networks, and even external graphics solutions.
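To put the figures above in perspective, the raw data rate of an uncompressed 4K video stream can be compared with the nominal link speeds mentioned in the text. The following is only a back-of-the-envelope sketch in Python: the 60 Hz refresh rate and 24 bits per pixel are my assumptions, the function names are made up for illustration, and blanking intervals, protocol overhead and the way real ports actually carry DisplayPort signals are all ignored.

```python
# Rough arithmetic only: active pixels, no blanking intervals, no protocol overhead.

def raw_video_gbit_per_s(width, height, refresh_hz, bits_per_pixel=24):
    """Return the uncompressed pixel data rate in Gbit/s (1 Gbit = 10**9 bits)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def link_share(stream_gbit, link_gbit):
    """Return the fraction of a link's nominal speed that one stream would use."""
    return stream_gbit / link_gbit


# Nominal link speeds quoted in the text (Gbit/s).
NOMINAL_LINKS = {"USB 3.1 Gen 2": 10, "Thunderbolt 3": 40}

if __name__ == "__main__":
    # Assumed example: 3840 x 2160 pixels at 60 Hz, 24 bits per pixel.
    uhd60 = raw_video_gbit_per_s(3840, 2160, 60)
    print(f"Uncompressed 4K at 60 Hz: {uhd60:.2f} Gbit/s")
    for name, speed in NOMINAL_LINKS.items():
        print(f"{name}: {link_share(uhd60, speed):.0%} of nominal link speed")
```

Under these assumptions a single 4K stream at 60 Hz comes to roughly 11.9 Gbit/s, which is more than the nominal 10 Gbit/s of USB 3.1 Gen 2 but under a third of the nominal 40 Gbit/s of Thunderbolt 3.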
The heterogeneity and historical layering of everyday technologies complicate the landscape that electronics manufacturers would like to paint for us. As any student of the history of science and technology can tell, even the most successful technologies did not replace the earlier ones immediately, and there have always been reasons why people have opposed the adoption of new technologies. For USB-C and Thunderbolt, the process of wider adoption is clearly well underway, but there are also multiple factors that slow it down. The typical peripheral does not yet come with USB-C, but rather with the older versions. Even among expensive, high end mobile phones, there are still multiple models that manufacturers ship with older USB connectors, rather than with the new USB-C ones.
A potentially more crucial issue for most regular users is that Thunderbolt 3 & USB-C is still a relatively new and immature technology. The setup is also rather complex: with its integration of DisplayPort (video), PCI Express (PCIe, data) and DC power into a single hardware interface, it typically requires firmware and driver updates from multiple manufacturers to work seamlessly together before the TB3 magic starts happening. An integrated systems provider such as Apple has the best possibilities to make this work, as they control both the hardware and the software of their macOS computers. Apple is also, together with Intel, the developer of the original Thunderbolt, and the interface was first made commercially available in the 2011 version of MacBook Pro. However, today there is an explosion of various USB-C and Thunderbolt compatible devices coming to the market from multiple manufacturers, and the users are eager to explore the full potential of this new, high speed, interoperable wired ecosystem.
eGPU, or External Graphics Processing Unit, is a good example of this. There are entire hobbyist forums, like the eGPU.io website, dedicated to the fine art of connecting a full powered, desktop graphics card to a laptop computer via fast lane connections – either ExpressCard or Thunderbolt 3. The rationale for this is (apart from the sheer joy of tweaking) that in this manner, one can have a slim ultrabook computer with a long battery life for daily use, which is then capable of transforming into an impressive workstation or gaming machine when plugged into an external enclosure that houses the power hungry graphics card (these TB3 boxes typically have full length PCIe slots for installing GPUs, different sets of connection ports, and a separate desktop PC style power supply). VR (virtual reality) applications are one example of an area where the current generation of laptops has problems: while there are e.g. Nvidia GeForce GTX 10 series (1060 etc.) equipped laptops available today, most of them are not thin and light enough for everyday mobile use, or, if they are, their battery life and/or fan noise present issues.
Razer, an American-Chinese computing hardware manufacturer, is known as a pioneer in popularizing the field of eGPUs, with their introduction of the Razer Blade Stealth ultrabook, which can be plugged with a TB3 cable into the Razer Core enclosure (sold separately), for utilizing powerful GPU cards that can be installed inside the Core unit. A popular use case for TB3/eGPU connections is plugging a powerful external graphics card into a MacBook Pro, in order to make it into a more capable gaming machine. In practice, the early adopters have struggled with firmware and drivers that do not provide the direct support, from either the macOS side or the eGPU unit, that the Thunderbolt 3 implementation needs to actually work. (See e.g. https://egpu.io/akitio-node-review-the-state-of-thunderbolt-3-egpu/ .) However, more and more manufacturers have added support and released firmware updates, so the situation is already much better than a few months ago (see instructions at: https://egpu.io/setup-guide-external-graphics-card-mac/ .) In the area of PC laptops running Windows 10, the situation is comparable: a work in progress, with more software support slowly emerging. Still, it is easy to get lost in this evolving field. For example, Dell revealed in January that they had restricted the Thunderbolt 3 PCIe data lanes in their implementation of the premium XPS 15 notebook computer: rather than using the full 4 lanes, the XPS 15 had only 2 PCIe lanes connected in the TB3. There is e.g. this discussion in Reddit comparing the effects this has in the typical case where the eGPU is feeding the image into an external display, rather than back to the internal display of the laptop computer (see: https://www.reddit.com/r/Dell/comments/5otmir/an_approximation_of_the_difference_between_x2_x4/). The effects are not that radical, but it is one of the technical details that the early users of eGPU setups have struggled with.
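The difference between two and four PCIe lanes can be estimated with simple arithmetic. This Python sketch assumes PCIe 3.0 signalling (8 gigatransfers per second per lane with 128b/130b encoding), which is my assumption about the links discussed here rather than something stated in the sources. The function name is illustrative, and real-world eGPU throughput will be lower because of Thunderbolt and driver overhead.

```python
# Theoretical PCIe 3.0 link bandwidth; an estimate, not a measurement.

PCIE3_GIGATRANSFERS_PER_S = 8.0        # per lane, assumed PCIe 3.0
PCIE3_ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie3_bandwidth_gbit_per_s(lanes):
    """Return the theoretical payload bandwidth of a PCIe 3.0 link in Gbit/s."""
    return lanes * PCIE3_GIGATRANSFERS_PER_S * PCIE3_ENCODING_EFFICIENCY


if __name__ == "__main__":
    for lanes in (2, 4):
        gbit = pcie3_bandwidth_gbit_per_s(lanes)
        # Divide by 8 to convert Gbit/s into GB/s.
        print(f"x{lanes}: {gbit:.2f} Gbit/s (about {gbit / 8:.2f} GB/s)")
```

Under these assumptions an x2 link tops out near 15.8 Gbit/s and an x4 link near 31.5 Gbit/s, so the theoretical ceiling is halved. How much that matters in practice depends on how much of the bandwidth a given application actually uses.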
While fascinating from an engineering or hobbyist perspective, the situation of contemporary technologies for connecting everyday devices is still far from perfect. In thousands of meeting rooms and presentation auditoriums every day, people fail to connect their computers, get anything onto the screen, or get access to their presentation due to failures of online connectivity. A universal, high speed wireless standard for sharing data and displaying video would no doubt be the best solution for all. Meanwhile, a reliable and flexible, high speed standard in wired connectivity would already go a long way. The future will show whether Thunderbolt 3 can reach that kind of ubiquitous support. The present situation is pretty mixed and messy at best.
I am a regular user of headphones of various kinds, both wired and wireless, closed and open, with noise cancellation, and without. The latest piece of this technology I have invested in is the “AirPods” by Apple.
Externally, these things are almost comically similar to the standard “EarPods” that Apple provides with, or as an upgrade option for, their mobile devices. The classic white Apple design is there, just the cord has been cut, leaving the connector stems protruding from the user’s ears like small antennas (which they probably indeed also are, as well as directional microphone arms).
There are wireless headphone-microphone sets that have slightly better sound quality (even if AirPods are perfectly decent as wireless earbuds), or an even more neutral design. What is interesting here is, on the one hand, the “seamless” user experience which Apple has invested in, and on the other, the “artificial intelligence” Siri assistant, which is another key part of the AirPod concept.
The user experience of AirPods is superior to any other headphones I have tested, which is related to the way the small and light AirPods immediately connect with Apple iPhones, detect whether or not they are placed in the ear, and work for hours on one charge – and quickly recharge after a short session inside their stylishly designed, smart battery case. These things “just work”, in the spirit of the original Apple philosophy. In order to achieve this, Apple has managed to create a seamless combination of tiny sensors, battery technology, and a dedicated “W1 chip” which manages the wireless functionalities of AirPods.
The integration with the Siri assistant is the other key part of the AirPod concept, and the one that probably divides users’ views more than any other feature. A double tap on the side of an AirPod activates Siri, which can indeed understand short commands in multiple languages and respond to them, even carrying out simple conversations with the user. Talking to an invisible assistant is not, however, part of today’s mobile user cultures – even if Spike Jonze’s film “Her” (2013) shows that the idea is certainly floating around today. Still, mobile devices are often used while on the move, in public places, in buses, trains or airplanes, and it is just neither feasible nor socially acceptable for people to carry out constant conversations with their invisible assistants in these kinds of environments – not yet today, at least.
Regardless of this, Apple AirPods are actually to a certain degree designed to rely on such constant conversations, which makes them futuristic and ambitious, but also a rather controversial piece of design and engineering. Most notably, there are no physical buttons or other ways for adjusting volume in these headphones: you just double tap the side of an AirPod, and verbally tell Siri to turn the volume up, or down. This mostly works just fine and Siri does the job, but a small touch control gesture would be so much more user friendly.
There is something engaging in testing Siri with the AirPods, nevertheless. I did find myself walking around the neighborhood, talking to the air, and testing what Siri can do. There are already dozens of commands and actions that can be activated with the help of AirPods and Siri (there is no official listing, but examples are given in lists like this one: https://www.cnet.com/how-to/the-complete-list-of-siri-commands/). The abilities of Siri still fall short in many areas: it did not completely understand the Finnish I used in my testing, and the integration of third party apps is often limited, which is a real bottleneck, as these apps are what most of us are using our mobile devices for, most of the time. Actually, Google and the assistant they have in Android are better than Siri in many areas relevant for daily life (maps and traffic information, for example), but the user experience of their assistant is not yet as seamless or integrated a whole as that of Apple’s Siri.
All this considered, using AirPods is certainly another step in the general developmental direction where pervasive computing, AI, conversational interfaces and augmented reality are taking us, for better or worse. Well worth checking out, at least – for more, see Apple’s own pages: http://www.apple.com/airpods/.
Update: the new design is now live at: www.unet.fi. – My current university side home pages are from the year 2006, so there is a decade of Internet and WWW evolution looming over them. Static HTML is not so bad in itself – it is actually fast and reliable, as compared to some more flaky ways of doing things. However, people increasingly access online content with mobile devices, and getting a more “responsive” design (that is, web page design code that scales and adapts content differently for small and large screen devices) is clearly in order.
When one builds institutional home pages as part of a university or other organisational infrastructure, there are usually various technical limitations or other issues, and so it is in this case. While I have a small “personnel card” style, official contact page in our staff directory, I have wanted my personal home pages to include more content that would reflect my personal interests and publication activity, and to carry links to various resources that I find important or relevant. Our IT admin, however, has limited the WWW server technologies to a pretty minimal set, and there is, for example, no “mod_rewrite” module loaded in the Apache that serves our home pages. That means that my original idea to go with a “flat file CMS” to create the new pages (e.g. Kirby: https://getkirby.com/) did not work. There was only one CMS that I could find that worked without mod_rewrite (CMSimple: https://www.cmsimple.org/), and testing that was a pain (it was too clumsy and limited in terms of design templates and editing functions for my non-coder tastes). The other main alternative was to set up a CMS that relies on an actual database (MySQL or similar), but that was forbidden on personal home pages in our university, too.
For a while I toyed with the idea that I would actually set up a development server of my own, and use it to generate static code that I would then publish on the university server. Jekyll (https://jekyllrb.com/) was the most promising option in that area. I did indeed spend a few hours (after the kids had gone to bed) setting up a development environment on my Surface Pro 4, building on top of the Bash/Ubuntu subsystem, adding Python, Ruby, etc., but there was some SSH public key signing bug that broke the connection to GitHub, which is pretty essential for running Jekyll. Debugging that road proved to be too much for me – the “Windows Subsystem for Linux” is still pretty much a work-in-progress thing. Then I also tried to set up an Oracle VM VirtualBox with WordPress built in, but that produced some other, interesting problems of its own. (It might also just be a good idea to use something a bit more powerful than a Surface Pro for running multiple server, photo editing and other tools at the same time – but for many things, this tablet is actually surprisingly good.)
Currently, the plan is that I will develop my new home pages in WordPress, using a commercial “Premium” theme that comes with actual tutorials on how to use and adapt it for my needs (plus they promise support, when I’ll inevitably lose my way). In the last couple of days, I have made decent progress using the Microsoft WebMatrix package, which includes an IIS server, and a pretty fully featured WordPress that runs on top of that (see: http://ivanblagojevic.com/how-to-install-wordpress-on-windows-10-localhost/). I have installed the theme of my choice and the plugins it requires, and started the selection and conversion of content for the new framework. Microsoft, however, has decided to discontinue WebMatrix, and the current setup seems a bit buggy, which makes actual content production somewhat frustrating. The server can suddenly lose read permissions to some key graphics file, for example. Or a WordPress page with long and complex code starts breaking down at some point, so that it fails to render correctly. For example, when I had reached about the halfway point in creating the code and design for my publications page, the new text and graphics started appearing again from the top of the page, on top of the text that was there already!
I will probably end up setting up the home pages on another server, where I can actually get a full Apache, with mod_rewrite, MySQL and other necessary functions for implementing WordPress pages. In the UTA home pages there would then be redirect code that would show the way to the new pages. This is not optimal, since the search engines will no longer find my publications and content under the UTA.fi domain, but this is perhaps the simplest solution for getting the functionalities I want to actually run as they should. Alternatively, there are some ways to turn a WordPress site into static HTML pages, which can then be uploaded to the UTA servers. But I am not holding my breath that all WordPress plugins and other more advanced features would work that way.
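The redirect mentioned above could be as simple as a static HTML stub, which needs neither mod_rewrite nor a database on the old server. Below is a minimal Python sketch that generates such a stub. The target address https://example.org/ and the function name build_redirect_page are placeholders of mine, and a meta refresh is only one of several ways to do this (a server-side redirect would be preferable where the server configuration allows it).

```python
from html import escape


def build_redirect_page(target_url, title="This page has moved"):
    """Return a static HTML page that forwards visitors to target_url."""
    # Escape the values so that characters like & or " cannot break the markup.
    safe_url = escape(target_url, quote=True)
    safe_title = escape(title)
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>{safe_title}</title>
<meta http-equiv="refresh" content="0; url={safe_url}">
<link rel="canonical" href="{safe_url}">
</head>
<body>
<p>{safe_title}: <a href="{safe_url}">{safe_url}</a></p>
</body>
</html>
"""


if __name__ == "__main__":
    # Placeholder address; the printed page could be saved as index.html.
    print(build_redirect_page("https://example.org/"))
```

The visible link in the body is there for browsers and crawlers that ignore the meta refresh. The canonical link is a hint to search engines about where the content now lives, although how they treat it is up to them.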