Adaptive reconfigurable structures are seen as the next big step in the evolution of architecture. Achieving this vision, however, requires new tools that enable autonomous configuration of given elements based on a specified design objective. Various approaches have been considered in the past, ranging from rule-based methods to evolutionary optimization. Although successful in applications where search heuristics or informative objective functions can be provided, these methods struggle with long-term planning problems. In this paper, we tackle the sequential assembly of SL-blocks, which has the character of a combinatorial optimization problem. We explore the applicability of deep reinforcement learning algorithms, which have recently shown great success on combinatorial problems in other domains such as board games and molecular design. We highlight the unique challenges presented by the architectural design setting and compare the performance of these algorithms to evolutionary computation and heuristic search baselines.