Photography and artificial intelligence

Google Clips camera (image copyright: Google).

Most media attention on applications of AI (artificial intelligence and machine learning) has focused on areas such as smart traffic, autonomous cars, recommendation algorithms, and expert systems in all kinds of professional work. There are, however, also very interesting developments currently taking place around photography.

There are multiple areas where AI is augmenting or transforming photography. One is in how the software tools that professional and amateur photographers use are advancing. It is getting easier all the time to select complex areas in photos, for example, and to apply all kinds of useful, interesting or creative effects and functions to them (see e.g. what Adobe is writing about this in: https://blogs.adobe.com/conversations/2017/10/primer-on-artificial-intelligence.html). The technical quality of photos is improving, as AI and advanced algorithmic techniques are applied in e.g. enhancing the level of detail in digital photos. Even a blurry, low-resolution file can be augmented with AI to look like a very realistic, high-resolution photo of the subject (on this, see: https://petapixel.com/2017/11/01/photo-enhancement-starting-get-crazy/).

But the applications of AI do not stop there. Google and other developers are experimenting with “AI-augmented cameras” that can recognize persons and events taking place, and take action, making photos and videos of moments and topics that the AI, rather than the human photographer, deems worthy (see, e.g. Google Clips: https://www.theverge.com/2017/10/4/16405200/google-clips-camera-ai-photos-video-hands-on-wi-fi-direct). This development can go in multiple directions. There are already smart surveillance cameras, for example, that learn to recognize the family members and differentiate them from unknown persons entering the house. Such a camera, combined with a conversant backend service, can also serve the human users in their various information needs: telling whether kids have come home in time, or keeping track of any out-of-the-ordinary events that the camera and algorithms might have noticed. Lighthouse AI is one example that combines a smart security camera with such an “interactive assistant”.

In the domain of amateur (and also professional) photographer practices, AI also means many fundamental changes. There are already add-on tools like Arsenal, the “smart camera assistant”, which is based on the idea that manually tweaking all the complex settings of modern DSLR cameras is not that inspiring, or even necessary, for many users, and that a cloud-based intelligence could handle many challenging photography situations with better success than a fumbling regular user (see their Kickstarter video at: https://www.youtube.com/watch?v=mmfGeaBX-0Q). Such algorithms are already also being built into the cameras of flagship smartphones (see, e.g. AI-enhanced camera functionalities in Huawei Mate 10, and in Google’s Pixel 2, which use AI to produce sharper photos with better image stabilization and better optimized dynamic range). Such smartphones, like Apple’s iPhone X, typically come with a dedicated chip for AI/machine learning operations, like the “Neural Engine” of Apple. (See e.g. https://www.wired.com/story/apples-neural-engine-infuses-the-iphone-with-ai-smarts/).

Many of these developments point the way towards a future age of “computational photography”, where algorithms play as crucial a role in the creation of visual representations as optics do today (see: https://en.wikipedia.org/wiki/Computational_photography). It is interesting, for example, to think about situations where photographic presentations are constructed from data derived from a myriad of different kinds of optical sensors, scattered in wearable technologies and in the environment, which will try their best to match the mood, tone or message set by the human “creative director”, who is no longer employed as the actual camera-man/woman. It is also becoming increasingly complex to define the authorship and ownership of photos and, most importantly, to handle the privacy and processing issues related to visual and photographic data. – We are living in interesting times…

Cognitive engineering of mixed reality

 

iOS 11: user-adaptable control centre, with application and function shortcuts in the lock screen.

In the 1970s and 1980s the concept ‘cognitive engineering’ was used in the industry labs to describe an approach trying to apply cognitive science lessons to the design and engineering fields. There were people like Donald A. Norman, who wanted to devise systems that are not only easy, or powerful, but most importantly pleasant and even fun to use.

One of the classical challenges of making technology suit humans is that humans change and evolve, and differ greatly in motivations and abilities, while technological systems tend to stay put. Machines are created in a certain manner, and are mostly locked within the strict walls of the material and functional specifications they are based on, and (if correctly manufactured) operate reliably within those parameters. Humans, however, are fallible and changeable, but also capable of learning.

In his 1986 article, Norman uses the example of a novice and an experienced sailor, who greatly differ in their abilities to take the information from a compass and translate that into a desirable boat movement (through the use of tiller and rudder). There have been significant advances in multiple industries in making increasingly clear and simple systems that are easy to use by almost anyone, and this in turn has translated into increasingly ubiquitous or pervasive application of information and communication technologies in all areas of life. The televisions in our living rooms are computing systems (often equipped with apps of various kinds), our cars are filled with online-connected computers and assistive technologies, and in our pockets we carry powerful terminals into information, entertainment, and into the ebbs and flows of social networks.

There is, however, also an alternative interpretation of what ‘cognitive engineering’ could be, in this dawning era of pervasive computing and mixed reality. Rather than being limited only to engineering products that attempt to adapt to the innate operations, tendencies and limitations of human cognition and psychology, engineering systems that are actively used by large numbers of people also means designing and affecting the spaces within which our cognitive and learning processes will then evolve, fit in, and adapt. Cognitive engineering does not only mean designing and manufacturing certain kinds of machines, but it also translates into an impact that is made on the human element of this dialogical relationship.

Graeme Kirkpatrick (2013) has written about the ‘streamlined self’ of the gamer. There are social theorists who argue that living in a society based on computers and information networks produces new difficulties for people. Social, cultural, technological and economic transitions linked with life in late modern, capitalist societies involve movements from projects to new projects, and an associated necessity for constant re-training. There is not necessarily any “connecting theme” in life, or even a sense of personal progression. Following Boltanski and Chiapello (2005), Kirkpatrick analyses the subjective condition where life in contradiction – between exigency of adaptation and demand for authenticity – means that the rational course in this kind of systemic reality is to “focus on playing the game well today”. As Kirkpatrick writes, “Playing well means maintaining popularity levels on Facebook, or establishing new connections on LinkedIn, while being no less intensely focused on the details of the project I am currently engaged in. It is permissible to enjoy the work but necessary to appear to be enjoying it and to share this feeling with other involved parties. That is the key to success in the game.” (Kirkpatrick 2013, 25.)

One of the key theoretical trajectories of cognitive science has been focused on what has been called “distributed cognition”: our thinking is not only situated within our individual brains, but it is in complex and important ways also embodied and situated within our environments, and our artefacts, in social, cultural and technological means. Gaming is one example of an activity where people can be witnessed to construct a sense of self and its functional parameters out of resources that they are familiar with, and which they can freely exploit and explore in their everyday lives. Such technologically framed play is also increasingly common in working life, and our schools can similarly be approached as complex, designed and evolving systems that are constituted by institutions, (implicit, as well as explicit) social rules and several layers of historically sedimented technologies.

Beyond all the hype of new commercial technologies related to virtual reality, augmented reality and mixed reality of various kinds lies the fact that we have always already lived in a complex substrate of mixed realities: a mixture of ideas, values, myths and concepts of various kinds, that are intermixed and communicated within different physical and immaterial expressive forms and media. Cognitive engineering of mixed reality in this more comprehensive sense involves participation in dialogical cycles of design, analysis and interpretation, where practices of adaptation and adoption of technology are also forming the shapes these technologies are realized within. Within the context of game studies, Kirkpatrick (2013, 27) formulates this as follows: “What we see here, then, is an interplay between the social imaginary of the networked society, with its distinctive limitations, and the development of gaming as a practice partly in response to those limitations. […] Ironically, gaming practices are a key driver for the development of the very situation that produces the need for recuperation.” There are multiple other areas of technology-intertwined lives where similar double bind relationships are currently surfacing: in social use of mobile media, in organisational ICT, in so-called smart homes, and smart traffic design and user culture processes. – A summary? We live in interesting times.

References:
– Boltanski, Luc, and Eve Chiapello (2005) The New Spirit of Capitalism. London & New York: Verso.
– Kirkpatrick, Graeme (2013) Computer Games and the Social Imaginary. Cambridge: Polity.
– Norman, Donald A. (1986) Cognitive engineering. In: User Centered System Design, 31–61.

Drone from China: GeekBuying.com

Some experiences from international trade: in late June, I ordered a “drone” – a remote controlled quadcopter – from the Chinese seller GeekBuying.com. The drone in question was the MJX Bugs 2 model, with GPS, 1080P camera, altitude hold, and other nice features, and GeekBuying was advertising the best price.

GeekBuying changed the delivery company from TNT, which I had asked for, to DHL, but I finally got the drone on 10th July. It appeared to be a fine little device and worked fine – for two minutes. Then it ran out of battery, and a key problem emerged: the battery did not charge with the provided charger. The drone remained dead.

I contacted GeekBuying and their “After Sales Service”, and they responded by asking for photo or video evidence of the problem. I made a video where I showed how connecting the charger to the battery does nothing. There was a wait (of ten days), after which they said that they had “contacted the manufacturer” and were convinced that this was a battery problem. However, a battery is small and “easy to lose during the way”, so they wanted me to make another order, with which the replacement battery could be combined. This sounded a bit odd. I said thank you, but that I was not interested in ordering something else at the moment, and that I would appreciate it if they could just send me the replacement battery.

Another long wait. Finally, on 11th August, I got another small package from China, with the replacement battery GeekBuying had sent me. There is a photo below, showing the original Bugs 2 battery, and the “replacement”.

Batteries: original Bugs 2 battery, and the GeekBuying.com’s “replacement part”. Make a guess, which is which?

I mailed the GeekBuying After Sales Service again, explaining that the replacement battery was a completely wrong one, and that they had made a mistake. I did send them photos of both batteries, side by side, and explained that the replacement battery was of wrong capacity (750 mAh vs. 1800 mAh of the original), and that it was also of wrong shape, as the original Bugs 2 battery is specially designed to lock into the battery compartment of the drone. The whole deal was starting to smell fishy, and I asked for instructions to return the drone, and get a refund.

GeekBuying responded by email: “We are sorry for that the battery is not original, as there is no original battery in manufacturer. We confirmed it and the battery can work on this drone as well, pls try it first.”

I checked their website, and they were actually themselves advertising the original, 1800 mAh capacity Bugs 2 battery to be sold as a spare part (link here). In my response, I explained this, and said that I am not willing to “try” using a drone with a battery that is not designed for it: even though, with the right voltage and connector, the drone might operate for a couple of minutes, this small battery does not lock into the Bugs 2 battery compartment. It would be dangerous to fly a drone with it, as the battery might just disconnect, and the drone could drop on something – or someone. I also considered it a fraudulent practice to mail me a wrong battery, and claim that the manufacturer has no suitable battery, as they themselves openly advertise and sell the correct, original battery.

At this point I escalated the issue into a claim at PayPal.com. I had used PayPal for online shopping, because they advertise a certain level of buyer protection.

Even after this, the only responses I got from GeekBuying.com were emails asking me to use the wrong, small battery, and send them some videos showing how it is operating. Even a single look at the photo (above) would be enough to point out that this makes no sense.

I thought that the most obvious rotten practices would have been rooted out from online shopping – at least with big online stores – but this experience suggests otherwise. GeekBuying as a seller has been trying to get me to make further orders, so that it would make better financial sense for them to post the replacement battery for the faulty product they had sold and shipped. And, as I refused to make further orders, they deliberately posted a wrongly designed, smaller battery as a replacement – something that might even put people flying a drone with the wrong battery into physical danger.

It will be interesting to see if I will get any refund for the drone, in the end. There is the added complication that products with lithium-ion batteries can usually be shipped from China, as they come in cargo planes. But – as the kind lady in the local post office explained to me today – an individual might have trouble shipping them back, due to the tighter safety regulations of regular airmail. I tried disabling the batteries (using sticky tape) and got the drone and both batteries sent as a postal package back to the seller in China, but if the delivery company refuses to carry them, then I will not get a “confirm receipt of the merchandise” from GeekBuying, and it is unclear if PayPal will cover me in that case. Also, even though PayPal advertises “Refunded Returns”, with free shipping worldwide, the actual claim notice I got from them says that I am personally responsible for all shipping costs.

At this point: the “cheap price” I got from GeekBuying.com has grown quite a bit:

  • drone price: 106,81 euros
  • shipping (from China): 22,43 euros
  • Finnish customs & DHL service fee: 31,04 euros
  • return shipping fee: 43,00 euros
  • total = 203,28 euros.

And: all the used time, energy and peace of mind for all of this? Priceless?

Edit: finally in October, after initially failing to verify that I had indeed returned the drone to the seller, PayPal in the end (after I resubmitted the claim with photographic evidence) concluded that yes, there was indeed a faulty product and a wrong replacement battery, and that I had returned it to the seller, and they refunded the drone price. I lost all the other costs, and all the time and energy required.
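The cost list and the eventual refund can be tallied with a few lines of Python. This is only a minimal sketch: the dictionary keys and the helper name `parse_euros` are made up for illustration, and the amounts are the ones listed above, written with decimal commas.

```python
from decimal import Decimal

# Amounts as listed in the post (decimal comma, euros).
costs = {
    "drone price": "106,81",
    "shipping (from China)": "22,43",
    "Finnish customs & DHL service fee": "31,04",
    "return shipping fee": "43,00",
}


def parse_euros(text: str) -> Decimal:
    """Convert an amount written with a decimal comma into an exact Decimal."""
    return Decimal(text.replace(",", "."))


# Sum with Decimal to avoid binary floating-point rounding in money values.
total = sum((parse_euros(v) for v in costs.values()), Decimal("0"))

# PayPal refunded only the drone price, so everything else stayed as a loss.
refund = parse_euros(costs["drone price"])
net_loss = total - refund

print(f"total paid: {total} euros")
print(f"net loss after refund: {net_loss} euros")
```

In other words, after the refund the extra costs alone came to roughly nine tenths of the original drone price.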

Server Update: Elementary Error?

I have been running a Windows server in our basement pretty much nonstop since 2008. Originally a personal Web server, this HP Proliant machine has in recent years mostly worked as a LAN file server for backups, media archives and for home-internal sharing. Even with a new 1.5 terabyte disk installed some years ago, it was running out of disk space. The old Windows 2008 Server was also getting painfully slow.

New server components (August 2017)

I decided to do a bit of an update, and got a “small” 120 GB SSD for the new system, and a WD Red 4.0 terabyte NAS disk for data. (I also considered their 8 TB “Archive” disk, but I do not need quite that much space, yet, and the “Red” model was a bit faster for my general purpose use. It was also cheaper.)

This time I decided to go the Linux way – my aging dual-core Xeon based system is more suitable for a somewhat lighter OS than a full Windows Server installation. On the other hand, I was curious to try newer Linux distributions, so I picked “elementary OS”, which has attracted some positive press recently.

HP Proliant ML110 G5, opened.

The hardware installation took its time, but I must say that I respect the build quality of this budget-class Proliant ML110 Gen5 machine. It has been running for almost ten years without a single issue (hardware-related, I mean), and it is very solid, and a pleasure to open and maintain (something that cannot be said of several consumer oriented computers that I have used).

Installing elementary OS (“loki”)

Also the Linux installation, with my Samba and Dropbox components, is now finally up and running. But I have to say that I am a bit disappointed with elementary OS (0.4.1 “loki”) at the moment. It might have been the wrong distribution for my needs to start with. It surely looks pretty, but it is also very restricted – many essential administrative tools or features are disabled or not available, by design. Apparently it is made so easy and safe for beginners that it is hard to use this “eOS” for most things that Linux normally is used for: development, programming, systems administration.

It is possible to tweak Linux installations, of course, and I have now patched or hacked the new system to be more allowing and capable, but some new issues have emerged in the process. I wonder if it is possible just to overwrite the “elementary” into a regular Ubuntu Server version, for example, or do I need to reinstall everything and lose the work that I have already done? I need to study the wonderful world of Linux distros a bit more, obviously.

Yoga 510, Signature Edition

At home, I have been setting up and testing a new, dual-boot Win10/Linux system. Lenovo Yoga 510 is a budget-class, two-in-one device that I am currently setting up as a replacement for my old Vivobook (unfortunately, it has a broken power plug/motherboard, now). Technical key specs (510-14ISK, 80S70082MX model, Signature Edition) include an Intel i5-6200U processor (a 2,30-2,80 GHz Skylake model), Intel HD Graphics 520 graphics, 4 GB of DDR4 memory, 128 GB SSD, IPS Full HD (1920 x 1080) 14″ touch-screen display, a Synaptics touchpad and a backlit keyboard. There is WiFi (802.11 a/b/g/n/ac) and Bluetooth 4.0. Contrasted to some other, thinner and lighter devices, this one has a nice set of connectors: one USB 2.0 and two USB 3.0 ports (no Thunderbolt, though). There is also a combo headphone/mic jack, Harman branded speakers, a memory card slot (SD, SDHC, SDXC, MMC), a 720p webcam, and an HDMI connector. There is also a small hidden “Novo Button”, which is needed to get to the BIOS settings.

This is a last-year model (there is already a “Yoga 520” with Kaby Lake chips available), and I got a relatively good deal from Gigantti store (499 euros). (Edit. I forgot to mention this has also a regular, full size wired gigabit ethernet port, which is also nice.)

The strong points (as contrasted to my trusty old Vivobook, that is) start with battery life, which according to my experience and Lenovo’s promises is over eight hours of light use. The IPS panel is not the best I have seen (MS Surface Pro has a really excellent display), but it is still really good as compared to the older TN panels. Multi-touch also operates pretty well, even if the touchpad is not so much to my taste (its feel is a bit ‘plasticky’, and it uses inferior Synaptics drivers as contrasted to the “precision touchpads”, which send raw data directly to Windows to handle).

The high point of Lenovo Thinkpad laptops has traditionally been their keyboards. This Yoga model is not one of the professional Thinkpad line, but the keyboard is rather good, as compared to the shallow, non-responsive keyboards that seem to be the trend these days. The only real problem is the non-standard positioning of the up-arrow/PageUp and RightShift keys – it is really maddening when writing, as while touch-typing every Right-Shift press produces an erroneous keypress that moves the cursor up (potentially e.g. moving focus to “Send Email” rather than to typing, as I have already witnessed). But this can sort of be fixed by use of KeyTweak or a similar tool, which can be used to remap these two keys the other way around. Not optimal, but a small nuisance, really.
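For the curious: remapping tools of this kind generally work by writing a binary `Scancode Map` value into the Windows registry, which the keyboard driver reads at boot. Below is a rough sketch of how such a value could be built for the swap described above. The function name `build_scancode_map` is made up for illustration, and the scancodes (0x36 for Right Shift, 0xE048 for the up arrow) are the standard PC values, which I am assuming this Yoga keyboard also uses. The sketch only builds the bytes; it does not touch the registry.

```python
import struct

# Assumed standard PC scancodes; not verified on the Yoga 510 itself.
RIGHT_SHIFT = 0x0036
UP_ARROW = 0xE048  # "extended" key, prefix E0


def build_scancode_map(swaps):
    """Build the binary 'Scancode Map' value for a list of key swaps.

    Layout: 8 zero bytes (version + flags), a 4-byte entry count that
    includes the terminator, one 4-byte entry per remapped key
    (new scancode, then original scancode), and a 4-byte zero terminator.
    All fields are little-endian.
    """
    entries = []
    for key_a, key_b in swaps:
        entries.append((key_b, key_a))  # physical key A now sends B
        entries.append((key_a, key_b))  # physical key B now sends A
    header = struct.pack("<III", 0, 0, len(entries) + 1)
    body = b"".join(struct.pack("<HH", new, old) for new, old in entries)
    terminator = struct.pack("<I", 0)
    return header + body + terminator


scancode_map = build_scancode_map([(RIGHT_SHIFT, UP_ARROW)])
print(scancode_map.hex(" ", 4))
```

The resulting bytes would go into the `Scancode Map` value under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout`, and take effect after a reboot. Letting a tool like KeyTweak do this is less error-prone than editing the registry by hand.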

Installing dual boot Ubuntu requires the usual procedures (disabling Secure Boot and fast startup, shrinking the Windows partition, etc.), but in the end Linux runs on this Lenovo laptop really well. The touch screen and all special keys I have tested work flawlessly right after the standard Ubuntu 17.04 installation, without any gimmicky hacking. Having a solid (a bit heavy, though) laptop with a 14-inch touch-enabled, 360 degree rotating screen, which can be used without issues with the most recent versions of both Windows 10 and Linux, is a rather nice thing. Happy with this, at the moment.

Note on working with PDFs and digital signatures

Adobe Global Guide to Electronic Signature Law

Portable Document Format (PDF) files are a pretty standard element in academic and business life these days. It is sort of a compromise, a tool for living a life that is partly based on traditional paper documents and their conventions, and partly on new, digital functionalities. A PDF file should keep the appearance of the document the same as it moves from device to device and user to user, and it can facilitate various more advanced functionalities.

One such key function is the ability to sign a document (an agreement, a certificate, or such) with a digital signature. This can greatly speed up many critical processes in the contemporary, global, mobile and distributed lives of individuals and organisations. Rather than waiting for a key person to arrive back from a trip to their office, to physically use pen and paper to sign a document, a PDF version of the document (for example) can just be mailed to the person, who then adds their digital signature to the file, saves, and sends the signed version back.

In legal and technical terms, there is nothing stopping us from moving completely to using digital signatures. Overviews of the legal situation in different countries are available, such as the Adobe Global Guide to Electronic Signature Law.

And Adobe, the leading company in the electronic documents business, provides step-by-step instructions on how to add or generate the cryptographic mechanisms that ensure the authenticity of digital signatures in PDFs with their Acrobat toolset.
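To give a rough idea of what such cryptographic mechanisms do, here is a toy “hash-then-sign” sketch in Python. It is emphatically not how Acrobat or the PDF standard implements signatures: real tools use certificates, padding schemes and much larger keys. The key material and the function names (`sign`, `verify`, `digest_as_int`) are invented for illustration only.

```python
import hashlib

# Toy key material: two known Mersenne primes. Far too small and too
# structured for real use; this is an illustration, not a secure key.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document and reduce the digest to a number below N."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Signer: transform the document's digest with the private exponent."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Anyone: undo the signature with the public exponent and compare digests."""
    return pow(signature, E, N) == digest_as_int(document)


contract = b"I agree to the terms of this contract."
signature = sign(contract)
print(verify(contract, signature))              # the untouched document
print(verify(contract + b" (or not)", signature))  # a modified document
```

The point of the sketch is that the signature is bound both to the signer’s private key and to the exact contents of the file: change even one byte of the document and the verification no longer matches.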

According to my experience, most contracts and certificates still are required to be signed with a physical pen, ink, and paper, even while the digital tools exist. The reasons are not legal or technical, but rather rooted in organisation routines and processes. Many traditional organisations are still not “digital” or “paperless”, but rather build upon decades (or: centuries!) of paper-trail. If the entire workflow is built upon the authority of authentic, physically signed contracts and other legal (paper) documents, it is hard to transform the system. At the same time, the current situation is far from optimal: in many cases there is double work, as everything needs to exist both as the physical papers (for signing, and for paper-based archiving), and then scanned into PDFs (for distribution, in intranets, in email, in other electronic archives that people use in practice).

While all of us can make some small steps towards using digital signatures and get rid of the double work (and wasting of natural resources), we can also read about the long history of “paperless office” – a vision of the future, originally popularized by a Business Week article in 1975 (see: https://en.wikipedia.org/wiki/Paperless_office and the 2001 critique by Sellen & Harper: https://mitpress.mit.edu/books/myth-paperless-office).

And, btw, a couple of useful tips:

Brydge 12.3, Surface Pro 4

Surface Pro 4, with Brydge 12.3 and MS Type Cover

Getting the input right is one of the most challenging issues in today’s world of pervasive, multimodal computing and services. Surface Pro 4 is an excellent multitouch tablet, and with the Surface Pen it is perfect for review and marking (key elements in academic life). The problem with a tablet as a main computer is that many productivity oriented tasks really call for a mouse and keyboard style approach.

There are pretty good add-on keyboards for today’s tablet computers, and one can of course also attach a full size keyboard and mouse combo to a Surface Pro. However, a keyboard cover that is always with you is the optimal companion for a tablet user. The official Type Cover by Microsoft is a really good compromise: it is thin, light, has decent keys, an excellent touchpad, and a backlight, which is really important for business use. There is a certain wobbly, flexible quality in the keys though, and writing a whole day with one can create a certain strain.

I have now tested a new, much more solid alternative: the Brydge 12.3 keyboard cover. It is made of strong aluminium, has 160 degree rotating hinges that create a firm grip on the corners of the tablet, and its island style keys are also backlit. According to my experience, the usability issues with Brydge relate on one hand to the unreliability of the Bluetooth connection – sometimes I would spend several minutes after tablet wake-up waiting for the keyboard to re-establish its connection. The other issue is that the integrated touchpad is rather bad. It is hard to control precisely, pointer movement is wobbly, and not all Windows 10 mouse gestures are supported. It is also very small by today’s standards, and clicks register randomly. The sensible way to use the Brydge is alongside a wired or wireless mouse – this, however, diminishes its value as a real laptop replacement option. The trackpad in the Type Cover is so much better in regular use that in the end it trumps Brydge’s better (or at least more solid) keyboard. The plus side of using Brydge is that in tactile terms, it transforms Surface Pro into a (small and heavy) laptop computer.

It is apparently hard to get a 2-in-1 device right. However, multiple manufacturers have recently introduced their own takes on the same theme, so there might be better options out there already.
